    mm, memcg: give exiting processes access to memory reserves · 465adcf1
    David Rientjes authored
    A memcg may livelock when oom if the process that grabs the hierarchy's
    oom lock is never the first process with PF_EXITING set in the memcg's
    task iteration.
    
    The oom killer, both global and memcg, defers if it finds an eligible
    process that is already exiting and is not being ptraced.  The idea is
    to let that process exit without using memory reserves rather than
    needlessly killing another process.
    
    This normally works fine except in the memcg case with a large number of
    threads attached to the oom memcg.  In this case, the memcg oom killer
    only gets called for the process that grabs the hierarchy's oom lock;
    all others end up blocked on the memcg's oom waitqueue.  Thus, if the
    process that grabs the hierarchy's oom lock is never the first
    PF_EXITING process in the memcg's task iteration, the oom killer is
    constantly deferred without anything making progress.
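
The deferral described above can be sketched as a small userspace model. This is only an illustration of the logic in the text, not the kernel's code: `struct model_task`, `enum scan_result`, and `model_oom_scan()` are hypothetical names, and the task list stands in for the memcg's task iteration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for kernel task state; not the real task_struct. */
struct model_task {
	bool exiting;   /* models PF_EXITING */
	bool ptraced;   /* models being ptraced on exit */
};

enum scan_result { SCAN_KILL_VICTIM, SCAN_SELECT_SELF, SCAN_DEFER };

/*
 * Walk the memcg's tasks in iteration order on behalf of `caller`, the
 * index of the task that holds the hierarchy's oom lock.
 */
static enum scan_result model_oom_scan(const struct model_task *tasks,
				       size_t n, size_t caller)
{
	assert(tasks != NULL && caller < n);

	for (size_t i = 0; i < n; i++) {
		if (!tasks[i].exiting)
			continue;
		if (i == caller)
			return SCAN_SELECT_SELF; /* caller itself is exiting */
		if (!tasks[i].ptraced)
			return SCAN_DEFER;       /* wait for tasks[i] to exit */
	}
	return SCAN_KILL_VICTIM; /* nothing is exiting: pick a victim */
}
```

In this model, whenever the lock holder is not the first exiting task in the list, every call returns `SCAN_DEFER`. If the exiting task ahead of it is itself stuck waiting for memory, no call ever makes progress, which is the livelock.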
    
    The fix is to give PF_EXITING processes access to memory reserves,
    which marks them as oom killed without any iteration.  This allows
    __mem_cgroup_try_charge() to succeed so that the process may exit.  This
    makes the memcg oom killer exemption for TIF_MEMDIE tasks, now
    immediately granted to processes with pending SIGKILLs and those in the
    exit path, equivalent to what is done for the global oom killer.
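
A minimal userspace sketch of the fix as described, under stated assumptions: the names `model_task`, `model_memcg_oom_fast_path()`, and `model_try_charge()` are hypothetical, and the charge check is reduced to a single comparison. It is not the patch itself, only a model of the behavior the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for kernel task state; not the real task_struct. */
struct model_task {
	bool exiting;          /* models PF_EXITING */
	bool sigkill_pending;  /* models a pending SIGKILL */
	bool memdie;           /* models TIF_MEMDIE: may use memory reserves */
};

/*
 * Entry to the modeled memcg oom killer, run by the oom lock holder `cur`.
 * Returns true if `cur` was granted reserves, so no task scan is needed.
 */
static bool model_memcg_oom_fast_path(struct model_task *cur)
{
	assert(cur != NULL);

	if (cur->sigkill_pending || cur->exiting) {
		cur->memdie = true; /* grant access to reserves immediately */
		return true;
	}
	return false; /* fall back to scanning the memcg's tasks */
}

/* Modeled charge: succeeds under the limit, or always for memdie tasks. */
static bool model_try_charge(const struct model_task *t,
			     size_t usage, size_t limit)
{
	assert(t != NULL);
	return t->memdie || usage < limit;
}
```

Because the check looks only at the calling task, the outcome no longer depends on where that task sits in the iteration order.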
    
    Signed-off-by: David Rientjes <rientjes@google.com>
    Acked-by: Michal Hocko <mhocko@suse.cz>
    Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
memcontrol.c 184 KB