    mm: multi-gen LRU: optimize multiple memcgs · f76c8337
    Yu Zhao authored
    When multiple memcgs are available, it is possible to use generations as a
    frame of reference to make better choices and improve overall performance
    under global memory pressure.  This patch adds a basic optimization to
    select memcgs that can drop single-use unmapped clean pages first.  Doing
    so reduces the chance of going into the aging path or swapping, which can
    be costly.
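
The selection idea above can be pictured with a small model. This is a minimal sketch, not the kernel implementation: the `Memcg` class, its two counters, and the `reclaim` helper are hypothetical names introduced only for illustration. It assumes each memcg exposes a count of single-use unmapped clean pages (cheap to drop) and a count of cold anon pages (which would need swapping).

```python
from dataclasses import dataclass


@dataclass
class Memcg:
    """Toy stand-in for a memcg; the fields are hypothetical, not kernel fields."""
    name: str
    cold_clean_file_pages: int  # single-use, unmapped, clean page cache pages
    cold_anon_pages: int        # reclaiming these would require swapping


def reclaim(memcgs, target):
    """Plan a reclaim of `target` pages, preferring pages that are cheap to drop.

    Pass 1 takes clean file pages from every memcg that has some.
    Pass 2 falls back to swapping only if the target is still not met.
    """
    plan = []
    remaining = target

    for m in memcgs:
        if remaining <= 0:
            break
        take = min(m.cold_clean_file_pages, remaining)
        if take:
            plan.append((m.name, "drop", take))
            remaining -= take

    for m in memcgs:
        if remaining <= 0:
            break
        take = min(m.cold_anon_pages, remaining)
        if take:
            plan.append((m.name, "swap", take))
            remaining -= take

    return plan


memcgs = [Memcg("anon-heavy", 0, 1000), Memcg("io-heavy", 800, 10)]
print(reclaim(memcgs, 500))
print(reclaim(memcgs, 900))
```

In this toy model, a target that the buffered I/O memcg can satisfy alone never touches the anon-heavy memcg, which mirrors the stated goal of reducing the chance of aging or swapping.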
    
    A typical example that benefits from this optimization is a server running
    mixed types of workloads, e.g., heavy anon workload in one memcg and heavy
    buffered I/O workload in the other.
    
    Though this optimization can be applied to both kswapd and direct reclaim,
    it is only added to kswapd to keep the patchset manageable.  Later
    improvements may cover the direct reclaim path.
    
    While ensuring a degree of fairness to all eligible memcgs, proportional
    scans of individual memcgs also require proper backoff to avoid
    overshooting their aggregate reclaim target by too much.  Overshooting
    can cause high direct reclaim latency.  The conditions for backoff are:
    
    1. At low priorities, for direct reclaim, if aging fairness or direct
       reclaim latency is at risk, i.e., aging one memcg multiple times or
       swapping after the target is met.
    2. At high priorities, for global reclaim, if per-zone free pages are
       above respective watermarks.
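
The two backoff conditions can be written as a single predicate. This is a sketch of one reading of the list above, not the code in vmscan.c: the function name and every parameter are hypothetical, priority is reduced to a boolean because the text only distinguishes "low" from "high", and "aging one memcg multiple times" and "swapping after the target is met" are modeled as two independent flags.

```python
def should_back_off(*, high_priority, direct_reclaim, global_reclaim,
                    target_met, would_age_again, would_swap,
                    zones_above_watermarks):
    """Return True if a proportional scan should stop early.

    All parameters are hypothetical booleans describing the current
    reclaim pass; none of them is a real kernel field.
    """
    # Condition 1: low priority, direct reclaim, and either aging fairness
    # (aging the same memcg again) or latency (swapping although the
    # target is already met) is at risk.
    if not high_priority and direct_reclaim:
        if would_age_again or (target_met and would_swap):
            return True

    # Condition 2: high priority, global reclaim, and per-zone free pages
    # are already above their watermarks.
    if high_priority and global_reclaim and zones_above_watermarks:
        return True

    return False
```

For example, under these assumptions a low-priority direct reclaim pass that has met its target and would otherwise start swapping backs off, while a kswapd pass in the same state does not.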
    
    Server benchmark results:
      Mixed workloads:
        fio (buffered I/O): +[19, 21]%
                    IOPS         BW
          patch1-8: 1880k        7343MiB/s
          patch1-9: 2252k        8796MiB/s
    
        memcached (anon): +[119, 123]%
                    Ops/sec      KB/sec
          patch1-8: 862768.65    33514.68
          patch1-9: 1911022.12   74234.54
    
      Mixed workloads:
        fio (buffered I/O): +[75, 77]%
                    IOPS         BW
          5.19-rc1: 1279k        4996MiB/s
          patch1-9: 2252k        8796MiB/s
    
        memcached (anon): +[13, 15]%
                    Ops/sec      KB/sec
          5.19-rc1: 1673524.04   65008.87
          patch1-9: 1911022.12   74234.54
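
The bracketed ranges in the tables above can be cross-checked against the raw numbers. The sketch below uses only the IOPS and Ops/sec figures quoted in this message; the helper name `gain_percent` and the dictionary layout are illustrative choices.

```python
def gain_percent(baseline, patched):
    """Relative improvement of `patched` over `baseline`, in percent."""
    return (patched / baseline - 1.0) * 100.0


# (baseline, patched, reported low %, reported high %)
results = {
    "fio IOPS, patch1-8 vs patch1-9": (1880e3, 2252e3, 19, 21),
    "memcached Ops/sec, patch1-8 vs patch1-9": (862768.65, 1911022.12, 119, 123),
    "fio IOPS, 5.19-rc1 vs patch1-9": (1279e3, 2252e3, 75, 77),
    "memcached Ops/sec, 5.19-rc1 vs patch1-9": (1673524.04, 1911022.12, 13, 15),
}

for label, (base, new, low, high) in results.items():
    gain = gain_percent(base, new)
    status = "within" if low <= gain <= high else "outside"
    print(f"{label}: {gain:.1f}% ({status} reported [{low}, {high}]%)")
```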
    
      Configurations:
        (changes since patch 6)
    
        cat mixed.sh
        modprobe brd rd_nr=2 rd_size=56623104
    
        swapoff -a
        mkswap /dev/ram0
        swapon /dev/ram0
    
        mkfs.ext4 /dev/ram1
        mount -t ext4 /dev/ram1 /mnt
    
        memtier_benchmark -S /var/run/memcached/memcached.sock \
          -P memcache_binary -n allkeys --key-minimum=1 \
          --key-maximum=50000000 --key-pattern=P:P -c 1 -t 36 \
          --ratio 1:0 --pipeline 8 -d 2000
    
        fio -name=mglru --numjobs=36 --directory=/mnt --size=1408m \
          --buffered=1 --ioengine=io_uring --iodepth=128 \
          --iodepth_batch_submit=32 --iodepth_batch_complete=32 \
          --rw=randread --random_distribution=random --norandommap \
          --time_based --ramp_time=10m --runtime=90m --group_reporting &
        pid=$!
    
        sleep 200
    
        memtier_benchmark -S /var/run/memcached/memcached.sock \
          -P memcache_binary -n allkeys --key-minimum=1 \
          --key-maximum=50000000 --key-pattern=R:R -c 1 -t 36 \
          --ratio 0:1 --pipeline 8 --randomize --distinct-client-seed
    
        kill -INT $pid
        wait
    
    Client benchmark results:
      no change (CONFIG_MEMCG=n)
    
    Link: https://lkml.kernel.org/r/20220918080010.2920238-10-yuzhao@google.com
    Signed-off-by: Yu Zhao <yuzhao@google.com>
    Acked-by: Brian Geffon <bgeffon@google.com>
    Acked-by: Jan Alexander Steffens (heftig) <heftig@archlinux.org>
    Acked-by: Oleksandr Natalenko <oleksandr@natalenko.name>
    Acked-by: Steven Barrett <steven@liquorix.net>
    Acked-by: Suleiman Souhlal <suleiman@google.com>
    Tested-by: Daniel Byrne <djbyrne@mtu.edu>
    Tested-by: Donald Carr <d@chaos-reins.com>
    Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
    Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
    Tested-by: Shuang Zhai <szhai2@cs.rochester.edu>
    Tested-by: Sofia Trinh <sofia.trinh@edi.works>
    Tested-by: Vaibhav Jain <vaibhav@linux.ibm.com>
    Cc: Andi Kleen <ak@linux.intel.com>
    Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
    Cc: Barry Song <baohua@kernel.org>
    Cc: Catalin Marinas <catalin.marinas@arm.com>
    Cc: Dave Hansen <dave.hansen@linux.intel.com>
    Cc: Hillf Danton <hdanton@sina.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Jonathan Corbet <corbet@lwn.net>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: Mel Gorman <mgorman@suse.de>
    Cc: Miaohe Lin <linmiaohe@huawei.com>
    Cc: Michael Larabel <Michael@MichaelLarabel.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Mike Rapoport <rppt@kernel.org>
    Cc: Mike Rapoport <rppt@linux.ibm.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Qi Zheng <zhengqi.arch@bytedance.com>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: Will Deacon <will@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>