    mm, oom: do not trigger out_of_memory from the #PF · 60e2793d
    Michal Hocko authored
    Any allocation failure during the #PF path will return with VM_FAULT_OOM
    which in turn results in pagefault_out_of_memory.  This can happen for 2
    different reasons: a) memcg is out of memory and we rely on
    mem_cgroup_oom_synchronize to perform the memcg OOM handling, or b) a
    normal allocation fails.
    
    The latter is quite problematic because allocation paths already trigger
    out_of_memory and the page allocator tries really hard to not fail
    allocations.  Anyway, if the OOM killer has already been invoked there
    is no reason to invoke it again from the #PF path, especially when the
    OOM condition might be gone by that time and we have no way to find out
    other than to allocate.
    
    Moreover, if the allocation failed and the OOM killer hasn't been invoked
    then we are unlikely to do the right thing from the #PF context because
    we have already lost the allocation context and restrictions and
    therefore might oom kill a task from a different NUMA domain.
    
    This all suggests that there is no legitimate reason to trigger
    out_of_memory from pagefault_out_of_memory, so drop it.  Just to be sure
    that no #PF path returns with VM_FAULT_OOM without allocation, print a
    warning that this is happening before we restart the #PF.
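
The resulting control flow can be sketched in userspace C. This is only an illustrative model, not the kernel code: `mem_cgroup_oom_synchronize_stub`, `pagefault_out_of_memory_sketch`, and the two state variables are hypothetical stand-ins, and the warning text is made up. A real implementation would likely also rate-limit the warning, which this sketch omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; not the real kernel API. */
static bool memcg_oom_pending;  /* models a memcg OOM waiting to be handled */
static int warnings_printed;    /* counts "leaked VM_FAULT_OOM" warnings */

/*
 * Stub modeled on mem_cgroup_oom_synchronize(): returns true when a memcg
 * OOM was pending, and clears it if asked to handle it.
 */
static bool mem_cgroup_oom_synchronize_stub(bool handle)
{
    if (!memcg_oom_pending)
        return false;
    if (handle)
        memcg_oom_pending = false;
    return true;
}

/*
 * Sketch of the post-patch behavior: case a) is left to memcg OOM
 * handling; case b) no longer invokes the global OOM killer and only
 * warns before the fault is restarted.
 */
static void pagefault_out_of_memory_sketch(void)
{
    if (mem_cgroup_oom_synchronize_stub(true))
        return;

    /* Deliberately no out_of_memory() call here. */
    warnings_printed++;
    fprintf(stderr, "VM_FAULT_OOM leaked out to the #PF handler, retrying\n");
}
```

In this model a pending memcg OOM is consumed silently, while any other VM_FAULT_OOM only produces a warning and lets the fault be retried.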
    
    [VvS: #PF allocation can hit the limit of the cgroup v1 kmem controller.
    This is a local problem related to memcg; however, it causes unnecessary
    global OOM kills that are repeated over and over again and escalate into a
    real disaster.  This has been broken since kmem accounting was
    introduced for cgroup v1 (3.8).  There was no kmem-specific reclaim for
    the separate limit, so the only way to handle the kmem hard limit was to
    return with ENOMEM.  Upstream, the problem will be fixed by removing the
    outdated kmem limit; however, stable and LTS kernels cannot do that and
    are still affected.  This patch fixes the problem and should be backported
    into stable/LTS.]
    
    Link: https://lkml.kernel.org/r/f5fd8dd8-0ad4-c524-5f65-920b01972a42@virtuozzo.com
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Mel Gorman <mgorman@techsingularity.net>
    Cc: Roman Gushchin <guro@fb.com>
    Cc: Shakeel Butt <shakeelb@google.com>
    Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
    Cc: Uladzislau Rezki <urezki@gmail.com>
    Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
oom_kill.c 31.6 KB