1. 31 Aug, 2019 3 commits
    • b4c46484 · Roman Gushchin authored
      mm, memcg: partially revert "mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones"
      
      Commit 766a4c19 ("mm/memcontrol.c: keep local VM counters in sync
      with the hierarchical ones") effectively decreased the precision of
      per-memcg vmstats_local and per-memcg-per-node lruvec percpu counters.
      
      That's good for displaying in memory.stat, but brings a serious
      regression into the reclaim process.
      
      One issue I've discovered and debugged is the following:
      lruvec_lru_size() can return 0 instead of the actual number of pages in
      the lru list, preventing the kernel from reclaiming the last remaining
      pages.  The result is yet another flood of dying memory cgroups.  The
      opposite also happens: scanning an empty lru list is a waste of cpu
      time.
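      The underreporting described above can be sketched in userspace.  This
      is a hypothetical model of per-cpu stat batching, not the kernel's
      actual memcg code: the names (mod_counter, BATCH, NR_CPUS) and the
      threshold value are illustrative.  Small per-cpu deltas stay in a local
      cache until they cross the batch threshold, so a reader that consults
      only the shared counter can see 0 even though pages are on the lru.

      ```c
      #include <stdio.h>
      #include <stdlib.h>

      #define NR_CPUS 4
      #define BATCH   32   /* deltas below this stay in the per-cpu cache */

      static long shared_count;          /* the batched, shared counter */
      static long percpu_delta[NR_CPUS]; /* per-cpu cached deltas */

      static void mod_counter(int cpu, long delta)
      {
          percpu_delta[cpu] += delta;
          if (labs(percpu_delta[cpu]) >= BATCH) {
              shared_count += percpu_delta[cpu];
              percpu_delta[cpu] = 0;
          }
      }

      /* Reader that sees only the shared counter (the batched view). */
      static long read_batched(void)
      {
          return shared_count;
      }

      /* Precise reader that also folds in the per-cpu caches. */
      static long read_precise(void)
      {
          long sum = shared_count;
          for (int cpu = 0; cpu < NR_CPUS; cpu++)
              sum += percpu_delta[cpu];
          return sum;
      }

      int main(void)
      {
          /* 5 pages added on each of 4 CPUs: 20 pages actually exist... */
          for (int cpu = 0; cpu < NR_CPUS; cpu++)
              for (int i = 0; i < 5; i++)
                  mod_counter(cpu, 1);

          /* ...but every delta is below BATCH, so the batched view is 0. */
          printf("batched=%ld precise=%ld\n", read_batched(), read_precise());
          return 0;
      }
      ```

      In this model, a reclaim decision based on the batched view would
      conclude the list is empty; this is the shape of the lruvec_lru_size()
      problem the revert addresses.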
      
      Also, inactive_list_is_low() can return incorrect values, preventing
      the active lru from being scanned and freed.  It can fail both because
      the sizes of the active and inactive lists are inaccurate and because
      the number of workingset refaults isn't precise.  In other words, the
      result is pretty random.
      
      I'm not sure whether using the approximate number of slab pages in
      count_shadow_number() is acceptable, but the issues described above are
      enough to partially revert the patch.
      
      Let's keep the per-memcg vmstat_local counters batched (they are only
      used for displaying stats to userspace), but make the lruvec stats
      precise.  This change fixes the dying-memcg flooding on my setup.
      
      Link: http://lkml.kernel.org/r/20190817004726.2530670-1-guro@fb.com
      Fixes: 766a4c19 ("mm/memcontrol.c: keep local VM counters in sync with the hierarchical ones")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Acked-by: Yafang Shao <laoar.shao@gmail.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • 441e254c · Andrew Morton authored
      mm/zsmalloc.c: fix build when CONFIG_COMPACTION=n
      Fixes: 701d6785 ("mm/zsmalloc.c: fix race condition in zs_destroy_pool")
      Link: http://lkml.kernel.org/r/201908251039.5oSbEEUT%25lkp@intel.com
      Reported-by: kbuild test robot <lkp@intel.com>
      Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Cc: Henry Burns <henrywolfeburns@gmail.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Jonathan Adams <jwadams@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • bee07b33 · Roman Gushchin authored
      mm: memcontrol: flush percpu slab vmstats on kmem offlining
      I've noticed that the "slab" value in memory.stat is sometimes 0, even
      if some child memory cgroups have a non-zero "slab" value.  The
      following investigation showed that this is the result of kmem_cache
      reparenting in combination with the per-cpu batching of slab vmstats.
      
      At offlining, some vmstat values may be left in the percpu cache, not
      propagated upwards through the cgroup hierarchy.  This means that stats
      on ancestor levels are lower than the actual values.  Later, when slab
      pages are released, the precise number of pages is subtracted on the
      parent level, making the value negative.  We don't show negative
      values; 0 is printed instead.
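      A userspace sketch of this underflow, under illustrative numbers (the
      96/32 split, show_stat, and the variable names are hypothetical, not
      the kernel's memcg code): part of a child's slab counter never leaves
      its percpu cache before offlining, so the exact subtraction at free
      time drives the parent's counter negative, which is then shown as 0.

      ```c
      #include <stdio.h>

      static long parent_slab;    /* parent's propagated "slab" counter */

      /* memory.stat prints negative internal values as 0. */
      static long show_stat(long v)
      {
          return v < 0 ? 0 : v;
      }

      int main(void)
      {
          long child_shared = 96;  /* already propagated to the parent */
          long child_percpu = 32;  /* still cached, never propagated */

          parent_slab += child_shared;   /* hierarchy only ever sees 96 */
          /* child is offlined here; its percpu cache is never flushed */

          /* all 128 slab pages are eventually freed and subtracted */
          parent_slab -= child_shared + child_percpu;

          /* prints: parent slab raw=-32 shown=0 */
          printf("parent slab raw=%ld shown=%ld\n",
                 parent_slab, show_stat(parent_slab));
          return 0;
      }
      ```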
      
      To fix this issue, let's flush percpu slab memcg and lruvec stats on
      memcg offlining.  This guarantees that numbers on all ancestor levels
      are accurate and match the actual number of outstanding slab pages.
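      The fix can be sketched the same way.  This is an illustrative model,
      not the kernel's memcg API (struct memcg, flush_percpu_stats, and
      NR_CPUS here are invented for the example): on offlining, fold every
      cached percpu delta into the counter and propagate it up the hierarchy,
      so all ancestor levels agree on the outstanding slab pages.

      ```c
      #include <stdio.h>

      #define NR_CPUS 4

      struct memcg {
          long slab;             /* propagated counter */
          long percpu[NR_CPUS];  /* per-cpu cached deltas */
          struct memcg *parent;
      };

      /* Flush cached percpu slab stats upwards, as done at offlining. */
      static void flush_percpu_stats(struct memcg *cg)
      {
          long sum = 0;
          for (int cpu = 0; cpu < NR_CPUS; cpu++) {
              sum += cg->percpu[cpu];
              cg->percpu[cpu] = 0;
          }
          for (struct memcg *c = cg; c; c = c->parent)
              c->slab += sum;
      }

      int main(void)
      {
          struct memcg root  = { .slab = 96 };
          struct memcg child = { .slab = 96, .parent = &root };
          child.percpu[0] = 20;    /* cached, not yet propagated */
          child.percpu[3] = 12;

          flush_percpu_stats(&child);  /* the offlining path */

          /* both levels now agree on the 128 outstanding slab pages */
          printf("child=%ld root=%ld\n", child.slab, root.slab);
          return 0;
      }
      ```

      With the caches emptied before the child goes away, the later precise
      subtraction at the parent can no longer underflow.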
      
      Link: http://lkml.kernel.org/r/20190819202338.363363-3-guro@fb.com
      Fixes: fb2f2b0a ("mm: memcg/slab: reparent memcg kmem_caches on cgroup removal")
      Signed-off-by: Roman Gushchin <guro@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 30 Aug, 2019 17 commits
  3. 29 Aug, 2019 8 commits
  4. 28 Aug, 2019 6 commits
  5. 27 Aug, 2019 6 commits