    memcg: move all accounting to parent at rmdir() · f817ed48
    KAMEZAWA Hiroyuki authored
    This patch provides a function to move the account information of a page
    between mem_cgroups and rewrites force_empty to make use of it.
    
    A page_cgroup is moved while both of the following hold:
     - lru_lock of the source and destination mem_cgroup is held.
     - lock_page_cgroup() is held.
    
    Therefore, a routine which touches pc->mem_cgroup without lock_page_cgroup()
    must confirm that pc->mem_cgroup is still valid.  Typical code looks like
    the following.
    
    (while page is not under lock_page())
    	unsigned long flags;

    	mem = pc->mem_cgroup;
    	mz = page_cgroup_zoneinfo(pc);
    	spin_lock_irqsave(&mz->lru_lock, flags);
    	if (pc->mem_cgroup == mem)	/* recheck: pc may have been moved */
    		...../* some list handling */
    	spin_unlock_irqrestore(&mz->lru_lock, flags);
    
    Of course, a better way is
    	lock_page_cgroup(pc);
    	....
    	unlock_page_cgroup(pc);
    
    But you must check the lock nesting order and avoid deadlock.
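
    The recheck pattern above (snapshot pc->mem_cgroup, take the lru_lock,
    compare again) can be modeled in user space.  This is only a sketch under
    assumptions: the structs are simplified stand-ins, not the kernel
    definitions, and cgroup_init(), charge() and lru_handle_checked() are
    hypothetical names.  pthread mutexes stand in for mz->lru_lock, and taking
    the two locks in address order is this sketch's own choice for avoiding
    deadlock, not something the patch prescribes.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; not the kernel definitions. */
struct mem_cgroup {
	pthread_mutex_t lru_lock;	/* models mz->lru_lock */
	int nr_pages;			/* pages accounted to this group */
};

struct page_cgroup {
	struct mem_cgroup *_Atomic mem_cgroup;	/* owner; may change on move */
};

static void cgroup_init(struct mem_cgroup *mem)
{
	pthread_mutex_init(&mem->lru_lock, NULL);
	mem->nr_pages = 0;
}

/* Account pc to mem (initial charge). */
static void charge(struct page_cgroup *pc, struct mem_cgroup *mem)
{
	pthread_mutex_lock(&mem->lru_lock);
	atomic_store(&pc->mem_cgroup, mem);
	mem->nr_pages++;
	pthread_mutex_unlock(&mem->lru_lock);
}

/* Move pc between groups with both LRU locks held (requires from != to). */
static void move_account(struct page_cgroup *pc, struct mem_cgroup *from,
			 struct mem_cgroup *to)
{
	/* Assumption of this sketch: lock in address order to avoid deadlock. */
	struct mem_cgroup *first = (uintptr_t)from < (uintptr_t)to ? from : to;
	struct mem_cgroup *second = first == from ? to : from;

	pthread_mutex_lock(&first->lru_lock);
	pthread_mutex_lock(&second->lru_lock);
	assert(atomic_load(&pc->mem_cgroup) == from);
	from->nr_pages--;
	to->nr_pages++;
	atomic_store(&pc->mem_cgroup, to);
	pthread_mutex_unlock(&second->lru_lock);
	pthread_mutex_unlock(&first->lru_lock);
}

/*
 * LRU handling without lock_page_cgroup(): "mem" is an unlocked snapshot of
 * pc->mem_cgroup and is trusted only if it still matches under the lock.
 */
static bool lru_handle_checked(struct page_cgroup *pc, struct mem_cgroup *mem)
{
	bool valid;

	pthread_mutex_lock(&mem->lru_lock);
	valid = atomic_load(&pc->mem_cgroup) == mem;
	/* if (valid), list handling on mem's LRU would go here */
	pthread_mutex_unlock(&mem->lru_lock);
	return valid;
}
```

    Passing the snapshot in as an argument makes the race reproducible in a
    single thread: a snapshot taken before move_account() fails the recheck
    afterwards, so the caller skips the list handling instead of touching the
    wrong group's LRU.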
    
    If you handle a page_cgroup taken from a mem_cgroup's LRU under mz->lru_lock,
    you don't have to worry about what pc->mem_cgroup points to.
    Moved pages are added to the head of the LRU, not to the tail.
    
    Expected users of this routine are:
      - force_empty (rmdir)
      - moving tasks between cgroups (to move account information.)
      - hierarchy (maybe useful.)
    
    force_empty (rmdir) uses this move_account and moves pages to its parent.
    This "move" will not cause OOM (I added an "oom" parameter to try_charge().)

    If the parent is busy (not enough memory), force_empty calls try_to_free_page()
    and reduces usage.
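
    The "move first, free if necessary" loop described above can be sketched
    with plain counters.  This is a minimal model under assumptions, not the
    patch's code: struct cg, try_charge_no_oom(), reclaim() and this
    force_empty() are hypothetical, each cgroup is reduced to usage/limit
    counters, and returning -EBUSY when nothing more can be freed is this
    sketch's choice.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical counter model of a mem_cgroup; not the kernel structure. */
struct cg {
	long usage;		/* pages currently charged */
	long limit;		/* maximum pages allowed */
	long reclaimable;	/* charged pages that reclaim could free */
};

/* Charge one page; never triggers OOM, only reports -ENOMEM at the limit. */
static int try_charge_no_oom(struct cg *cg)
{
	if (cg->usage >= cg->limit)
		return -ENOMEM;
	cg->usage++;
	return 0;
}

/* Free up to nr pages from cg; returns the number actually freed. */
static long reclaim(struct cg *cg, long nr)
{
	long freed = nr < cg->reclaimable ? nr : cg->reclaimable;

	cg->reclaimable -= freed;
	cg->usage -= freed;
	return freed;
}

/* Move every page of child to parent; 0 on success, -EBUSY if stuck. */
static int force_empty(struct cg *child, struct cg *parent)
{
	while (child->usage > 0) {
		if (try_charge_no_oom(parent) == 0) {
			child->usage--;	/* page is now accounted to parent */
			continue;
		}
		/* Parent is full: free one page there, then retry the move. */
		if (reclaim(parent, 1) == 0)
			return -EBUSY;
	}
	return 0;
}
```

    In this model, pages are freed only when a move fails at the parent's
    limit, so as many pages as possible stay in memory.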
    
    The purpose of this behavior is to
      - Fix the "forget all" behavior of force_empty and avoid leaking accounting.
      - By "moving first, free if necessary", keep pages in memory as much as
        possible.
    
    Adding a switch to change the behavior of force_empty to
      - free first, move if necessary
      - free all; if there are mlocked/busy pages, return -EBUSY.
    is under consideration. (I'll add it if someone requests it.)
    
    This patch also removes memory.force_empty file, a brutal debug-only interface.
    Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
    Tested-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
    Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Cc: Balbir Singh <balbir@in.ibm.com>
    Cc: Paul Menage <menage@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>