    memcg: prevent changes to move_charge_at_immigrate during task attach · ee5e8472
    Glauber Costa authored
    In memcg, we use the cgroup_lock basically to synchronize against
    attaching new children to a cgroup.  We do this because we rely on
    cgroup core to provide us with this information.
    
    We need to guarantee that upon child creation, our tunables are
    consistent.  For those, the calls to cgroup_lock() all live in handlers
    like mem_cgroup_hierarchy_write(), where we change a tunable in the
    group that is hierarchy-related.  For instance, the use_hierarchy flag
    cannot be changed if the cgroup already has children.
    
    Furthermore, those values are propagated from the parent to the child
    when a new child is created.  So if we don't lock like this, we can end
    up with the following situation:
    
    A                                   B
    memcg_css_alloc()                   mem_cgroup_hierarchy_write()
    copy use_hierarchy from parent      change use_hierarchy in parent
    finish creation.
    
    This is mainly because, during creation, we are not yet fully connected
    to the css tree.  So any iterators and the like that we could use will
    fail to show that the group has children.
    
    My observation is that all of creation can proceed in parallel with
    those tasks, except value assignment.  So what this patch series does is
    to first move all value assignment that is dependent on parent values
    from css_alloc to css_online, where the iterators all work, and then we
    lock only the value assignment.  This will guarantee that parent and
    children always have consistent values.  Together with an online test,
    which can be derived from the observation that the refcount of an online
    memcg can be made to be always positive, we should be able to
    synchronize our side without the cgroup lock.
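
The locking scheme described above can be modeled in userspace roughly as
follows.  This is only a sketch under stated assumptions, not the code in
this series: struct group, create_lock, nr_online_children,
group_online() and group_set_use_hierarchy() are all hypothetical names,
a pthread mutex stands in for whatever lock memcg ends up using, and a
plain counter stands in for the css iterators.  Returning -EBUSY from the
writer is likewise a choice made for the sketch.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a memcg. */
struct group {
    struct group *parent;
    int nr_online_children;   /* stand-in for "iterators see children" */
    bool use_hierarchy;       /* tunable inherited from the parent */
};

/* Hypothetical lock that covers only value assignment. */
static pthread_mutex_t create_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Online step: everything else in creation may run unlocked; only the
 * copy of parent-dependent values happens under the lock.
 */
static void group_online(struct group *g)
{
    pthread_mutex_lock(&create_lock);
    if (g->parent != NULL) {
        g->use_hierarchy = g->parent->use_hierarchy;
        g->parent->nr_online_children++;
    }
    pthread_mutex_unlock(&create_lock);
}

/*
 * Writer side: refuse to change the tunable once a child is online, so
 * parent and children can never disagree.
 */
static int group_set_use_hierarchy(struct group *g, bool val)
{
    int ret = 0;

    pthread_mutex_lock(&create_lock);
    if (g->nr_online_children > 0)
        ret = -EBUSY;
    else
        g->use_hierarchy = val;
    pthread_mutex_unlock(&create_lock);
    return ret;
}
```

Because the copy and the child accounting sit in the same critical section
as the writer's check, the A/B interleaving shown earlier cannot happen in
this model: the writer either runs before the copy, or sees the child and
backs off.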
    
    This patch:
    
    Currently, we rely on the cgroup_lock() to prevent changes to
    move_charge_at_immigrate during task migration.  However, this is only
    needed because the current strategy keeps checking this value throughout
    the whole process.  Since all we need is serialization, one needs only
    to guarantee that whatever decision we made at the beginning of a
    specific migration is respected throughout the process.
    
    We can achieve this by saving the value in mc when the migration
    starts.  With that, no locking is needed.
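
As a rough userspace illustration of the snapshot idea (a sketch only, not
the patch itself): struct group, struct move_ctx, the immigrate_flags
field, the MOVE_ANON/MOVE_FILE bits and the helper names below are
assumptions made for the example, and the "concurrent" write is simulated
sequentially in a single thread.

```c
#include <stdbool.h>

/* Hypothetical flag bits for the kinds of charges that may be moved. */
enum {
    MOVE_ANON = 1u << 0,
    MOVE_FILE = 1u << 1,
};

/* Simplified stand-in for a memcg; only the tunable is modeled. */
struct group {
    unsigned long move_charge_at_immigrate;
};

/* Simplified stand-in for the per-migration context ("mc"). */
struct move_ctx {
    unsigned long immigrate_flags;  /* snapshot taken when attach starts */
};

static struct move_ctx mc;

/* Read the tunable exactly once, at the start of the migration. */
static void migration_begin(const struct group *g)
{
    mc.immigrate_flags = g->move_charge_at_immigrate;
}

/* Every later decision consults the snapshot, never the live tunable. */
static bool move_anon(void)
{
    return (mc.immigrate_flags & MOVE_ANON) != 0;
}

static bool move_file(void)
{
    return (mc.immigrate_flags & MOVE_FILE) != 0;
}

/* Drop the snapshot when the migration is over. */
static void migration_end(void)
{
    mc.immigrate_flags = 0;
}
```

If the tunable is rewritten after migration_begin(), move_anon() and
move_file() keep answering according to the value seen at the start, and
the new value only takes effect for the next migration.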

    Signed-off-by: Glauber Costa <glommer@parallels.com>
    Acked-by: Michal Hocko <mhocko@suse.cz>
    Cc: Tejun Heo <tj@kernel.org>
    Cc: Hiroyuki Kamezawa <kamezawa.hiroyuki@gmail.com>
    Cc: Johannes Weiner <hannes@cmpxchg.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>