    memcg: replace stats_flush_lock with an atomic · 3cd9992b
    Yosry Ahmed authored
    As Johannes notes in [1], stats_flush_lock is currently used to:
    (a) Protect updates to stats_flush_threshold.
    (b) Protect updates to flush_next_time.
    (c) Serialize calls to cgroup_rstat_flush() based on those ratelimits.
    
    However:
    
    1. stats_flush_threshold is already an atomic.
    
    2. flush_next_time is not atomic. The writer is locked, but the reader
       is lockless. If the reader races with a flush, you could see this:
    
                                            if (time_after(jiffies, flush_next_time))
            spin_trylock()
            flush_next_time = now + delay
            flush()
            spin_unlock()
                                            spin_trylock()
                                            flush_next_time = now + delay
                                            flush()
                                            spin_unlock()
    
       which means we already can get flushes at a higher frequency than
       FLUSH_TIME during races. But it isn't really a problem.
    
       The reader could also see garbled partial updates if the compiler
       decides to split the write, so it needs at least READ_ONCE and
       WRITE_ONCE protection.
    
    3. Serializing cgroup_rstat_flush() calls against the ratelimit
       factors is currently broken because of the race in 2. But the race
       is actually harmless: all we might get is the occasional earlier
       flush. If there is no delta, the flush won't do much. And if there
       is, the flush is justified.
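
    The READ_ONCE/WRITE_ONCE point in 2 can be sketched in userspace with
    C11 relaxed atomics, which likewise guarantee a single untorn load or
    store. This is only an illustration, not the kernel code: the helper
    names set_next_flush() and flush_is_due() are hypothetical, and the
    plain comparison stands in for time_after() without its wraparound
    handling.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the kernel's flush_next_time. */
static _Atomic uint64_t flush_next_time;

/* Writer side: one untorn store, analogous to WRITE_ONCE(). */
static void set_next_flush(uint64_t now, uint64_t delay)
{
	atomic_store_explicit(&flush_next_time, now + delay,
			      memory_order_relaxed);
}

/*
 * Lockless reader: one untorn load, analogous to READ_ONCE().
 * Simplified comparison; the kernel's time_after() also copes with
 * counter wraparound.
 */
static bool flush_is_due(uint64_t now)
{
	uint64_t next = atomic_load_explicit(&flush_next_time,
					     memory_order_relaxed);

	return now > next;
}
```

    A reader may still act on a stale value and trigger an early flush,
    which is the harmless race described above; the only thing ruled out
    is a torn value.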
    
    So the lock can be removed altogether. However, the lock also served
    the purpose of preventing a thundering herd problem for concurrent
    flushers, see [2]. Use an atomic instead to serve the purpose of
    unifying concurrent flushers.
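
    A minimal userspace sketch of the idea, assuming C11 atomics in place
    of the kernel's atomic_t API. The names flush_ongoing, try_flush() and
    expensive_flush() are hypothetical and only illustrate how a flag can
    let one flusher proceed while concurrent callers skip:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag: nonzero while some caller is flushing. */
static atomic_int flush_ongoing;

/* Demo counter; only ever touched by the caller that owns the flag. */
static int flush_count;

/* Stand-in for the real, expensive flush of the whole tree. */
static void expensive_flush(void)
{
	flush_count++;
}

/*
 * Returns true if this caller performed the flush, false if it skipped
 * because another flush was already in progress.
 */
static bool try_flush(void)
{
	/*
	 * Cheap read first so that waiting callers do not all hammer the
	 * flag with read-modify-write operations; the exchange then
	 * decides the single winner.
	 */
	if (atomic_load(&flush_ongoing) ||
	    atomic_exchange(&flush_ongoing, 1))
		return false;

	expensive_flush();
	atomic_store(&flush_ongoing, 0);
	return true;
}
```

    Unlike a trylock, nothing here is held as a lock: the flag only tells
    latecomers that a flush covering their updates is already under way.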
    
    [1] https://lore.kernel.org/lkml/20230323172732.GE739026@cmpxchg.org/
    [2] https://lore.kernel.org/lkml/20210716212137.1391164-2-shakeelb@google.com/
    
    Link: https://lkml.kernel.org/r/20230330191801.1967435-5-yosryahmed@google.com
    Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Shakeel Butt <shakeelb@google.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Jens Axboe <axboe@kernel.dk>
    Cc: Josef Bacik <josef@toxicpanda.com>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Michal Koutný <mkoutny@suse.com>
    Cc: Muchun Song <muchun.song@linux.dev>
    Cc: Roman Gushchin <roman.gushchin@linux.dev>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Vasily Averin <vasily.averin@linux.dev>
    Cc: Zefan Li <lizefan.x@bytedance.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>