    zswap: shrink zswap pool based on memory pressure · b5ba474f
    Nhat Pham authored
    Currently, we only shrink the zswap pool when the user-defined limit is
    hit.  This means that if we set the limit too high, cold data that are
    unlikely to be used again will reside in the pool, wasting precious
    memory.  It is hard to predict how much zswap space will be needed ahead
    of time, as this depends on the workload (specifically, on factors such as
    memory access patterns and compressibility of the memory pages).
    
    This patch implements a memcg- and NUMA-aware shrinker for zswap that is
    triggered by memory pressure.  The shrinker has no parameters that must
    be tuned by the user, and can be opted in or out on a per-memcg basis.
    
    Furthermore, to make it more robust for many workloads and prevent
    overshrinking (i.e. evicting warm pages that might be refaulted into
    memory), we build in the following heuristics:
    
    * Estimate the number of warm pages residing in zswap, and attempt to
      protect this region of the zswap LRU.
    * Scale the number of freeable objects by an estimate of the memory
      saving factor. The better zswap compresses the data, the fewer pages
      we will evict to swap (as we will otherwise incur IO for relatively
      small memory saving).
    * During reclaim, if the shrinker encounters a page that is also being
      brought into memory, the shrinker will cautiously terminate its
      shrinking action, as this is a sign that it is touching the warmer
      region of the zswap LRU.
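
    The three heuristics above can be sketched in plain userspace C.  This
    is only an illustration under stated assumptions, not the code added by
    this patch: the struct, its field names, and both helper functions are
    hypothetical, and the scaling assumes the memory saving factor can be
    approximated as the ratio of compressed to uncompressed pages.

```c
#include <stddef.h>

/* Hypothetical per-LRU counters; these names are illustrative only. */
struct zswap_lru_stats {
	unsigned long nr_stored;        /* pages currently on this zswap LRU */
	unsigned long nr_protected;     /* estimated warm pages to shield */
	unsigned long compressed_pages; /* memory backing them, in pages */
};

/*
 * Heuristics 1 and 2: exclude the protected (warm) region, then scale the
 * remainder by compressed/uncompressed size.  The better the data
 * compresses, the smaller the ratio and the fewer objects are reported.
 */
static unsigned long zswap_freeable(const struct zswap_lru_stats *s)
{
	unsigned long nr_freeable;

	if (s->nr_stored == 0 || s->nr_stored <= s->nr_protected)
		return 0;

	nr_freeable = s->nr_stored - s->nr_protected;

	/* Widen before multiplying so the product cannot overflow. */
	return (unsigned long)((unsigned long long)nr_freeable *
			       s->compressed_pages / s->nr_stored);
}

/*
 * Heuristic 3: walk the LRU from the cold end and stop at the first entry
 * that is concurrently being brought back into memory.  swapin_flags[i] is
 * nonzero when entry i is being swapped in (an assumption of this sketch).
 * Returns the number of entries written back.
 */
static unsigned long zswap_scan(const int *swapin_flags, size_t nr_entries,
				unsigned long nr_to_scan)
{
	unsigned long written = 0;
	size_t i;

	for (i = 0; i < nr_entries && written < nr_to_scan; i++) {
		if (swapin_flags[i])
			break; /* touching the warm region: back off */
		written++;
	}
	return written;
}
```

    In this sketch, 1000 stored pages with 200 protected and 250 pages of
    compressed backing memory report 200 freeable objects, and a scan stops
    as soon as it meets an entry that is being swapped in.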
    
    As a proof of concept, we ran the following synthetic benchmark: build the
    Linux kernel in a memory-limited cgroup, and allocate some cold data in
    tmpfs to see whether the shrinker could write them out and improve overall
    performance.  Depending on the amount of cold data generated, we observed
    a 14% to 35% reduction in kernel CPU time used in the kernel builds.
    
    [nphamcs@gmail.com: check shrinker enablement early, use less costly stat flushing]
      Link: https://lkml.kernel.org/r/20231206194456.3234203-1-nphamcs@gmail.com
    Link: https://lkml.kernel.org/r/20231130194023.4102148-7-nphamcs@gmail.com
    Signed-off-by: Nhat Pham <nphamcs@gmail.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Tested-by: Bagas Sanjaya <bagasdotme@gmail.com>
    Cc: Chris Li <chrisl@kernel.org>
    Cc: Dan Streetman <ddstreet@ieee.org>
    Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Roman Gushchin <roman.gushchin@linux.dev>
    Cc: Seth Jennings <sjenning@redhat.com>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Shuah Khan <shuah@kernel.org>
    Cc: Vitaly Wool <vitaly.wool@konsulko.com>
    Cc: Yosry Ahmed <yosryahmed@google.com>
    Cc: Chengming Zhou <chengming.zhou@linux.dev>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>