    mm: memcontrol: reclaim and OOM kill when shrinking memory.max below usage · b6e6edcf
    Johannes Weiner authored
    Setting the original memory.limit_in_bytes hardlimit is subject to a
    race condition when the desired value is below the current usage.  The
    code tries a few times to first reclaim and then see if the usage has
    dropped to where we would like it to be, but there is no locking, and
    the workload is free to continue making new charges up to the old limit.
    Thus, attempting to shrink a workload relies on pure luck and hope that
    the workload happens to cooperate.
    
    To fix this in the cgroup2 memory.max knob, do it the other way round:
    set the limit first, then try enforcement.  And if reclaim is not able
    to succeed, trigger OOM kills in the group.  Keep going until the new
    limit is met, we run out of OOM victims and there's only unreclaimable
    memory left, or the task writing to memory.max is killed.  This allows
    users to shrink groups reliably, and the behavior is consistent with
    what happens when new charges are attempted in excess of memory.max.
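
The ordering described above can be sketched as a small userspace model. This is not the kernel implementation: `toy_group`, `toy_reclaim`, `toy_oom_kill`, `toy_set_max`, and the retry budget of 5 are all hypothetical names and values chosen for illustration. It only aims to show the sequence "set the limit, reclaim, then OOM kill" and the three exit conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of a memory cgroup. Every name here is hypothetical. */
struct toy_group {
    unsigned long usage;          /* pages currently charged */
    unsigned long max;            /* hard limit, in pages */
    unsigned long reclaimable;    /* pages that reclaim can still free */
    unsigned long task_pages[4];  /* pages released by killing each task */
    size_t nr_tasks;              /* OOM victims still available */
    bool writer_killed;           /* models a fatal signal for the writer */
};

enum shrink_result { SHRINK_OK, SHRINK_STUCK, SHRINK_INTERRUPTED };

/* Try to free up to `want` pages; returns the number actually freed. */
static unsigned long toy_reclaim(struct toy_group *g, unsigned long want)
{
    unsigned long freed = want < g->reclaimable ? want : g->reclaimable;

    g->reclaimable -= freed;
    g->usage -= freed;
    return freed;
}

/* Kill one task and release its pages; false when no victim is left. */
static bool toy_oom_kill(struct toy_group *g)
{
    if (g->nr_tasks == 0)
        return false;
    g->nr_tasks--;
    assert(g->usage >= g->task_pages[g->nr_tasks]);
    g->usage -= g->task_pages[g->nr_tasks];
    return true;
}

/* Set the new limit first, then enforce it until one exit condition hits. */
enum shrink_result toy_set_max(struct toy_group *g, unsigned long new_max)
{
    int nr_reclaims = 5;  /* assumed budget of fruitless reclaim attempts */

    g->max = new_max;     /* new charges are bounded from this point on */

    for (;;) {
        if (g->usage <= g->max)
            return SHRINK_OK;           /* the new limit is met */
        if (g->writer_killed)
            return SHRINK_INTERRUPTED;  /* the writing task was killed */
        if (nr_reclaims > 0) {
            if (toy_reclaim(g, g->usage - g->max) == 0)
                nr_reclaims--;          /* no progress, use up one retry */
            continue;
        }
        if (!toy_oom_kill(g))
            return SHRINK_STUCK;        /* only unreclaimable memory left */
    }
}
```

Because the limit is stored before the loop starts, the model has no window in which usage can grow back toward the old limit while enforcement is still running, which is the race the first paragraph describes.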
    
    Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcontrol.c 151 KB