    memcg: avoid lock in updating file_mapped (Was "fix race in file_mapped accounting flag management")
    KAMEZAWA Hiroyuki authored
    When accounting file events per memory cgroup, we need to find the memory
    cgroup via page_cgroup->mem_cgroup.  Currently we use lock_page_cgroup() to
    guarantee that pc->mem_cgroup is not overwritten while we make use of it.
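
    For reference, the pre-patch update path takes the lock unconditionally on
    every update, roughly like the simplified sketch below (illustrative, not
    the exact hunk from memcontrol.c):

        /* Old path: lock_page_cgroup() is taken on every file_mapped update. */
        void mem_cgroup_update_file_mapped(struct page *page, int val)
        {
                struct page_cgroup *pc = lookup_page_cgroup(page);
                struct mem_cgroup *mem;

                if (unlikely(!pc))
                        return;

                lock_page_cgroup(pc);          /* taken on every map/unmap */
                mem = pc->mem_cgroup;          /* stable while the lock is held */
                if (mem && PageCgroupUsed(pc)) {
                        if (val > 0) {
                                __this_cpu_inc(mem->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
                                SetPageCgroupFileMapped(pc);
                        } else {
                                __this_cpu_dec(mem->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
                                ClearPageCgroupFileMapped(pc);
                        }
                }
                unlock_page_cgroup(pc);
        }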
    
    But, considering the context in which page_cgroups for file pages are
    accessed, we can use an alternative, light-weight mutual exclusion in most
    cases.
    
    At handling file caches, the only race we have to take care of is "moving"
    of accounts, IOW, overwriting of page_cgroup->mem_cgroup.  (See the comment
    in the patch.)
    
    Unlike charge/uncharge, "move" does not happen very frequently.  It happens
    only at rmdir() and at task moving (with special settings).
    This patch adds a race checker for file-cache-status accounting vs. account
    moving.  A new per-cpu, per-memcg counter MEM_CGROUP_ON_MOVE is added.
    The routine for account moving (see the sketch after this list)
      1. Increments it before starting the move.
      2. Calls synchronize_rcu().
      3. Decrements it after the end of the move.
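
    A minimal sketch of the move-side pairing, assuming helper names like
    mem_cgroup_start_move()/mem_cgroup_end_move() and the mc.lock spinlock used
    for account moving (simplified, not the exact patch hunk):

        static void mem_cgroup_start_move(struct mem_cgroup *mem)
        {
                int cpu;

                spin_lock(&mc.lock);
                for_each_possible_cpu(cpu)
                        per_cpu(mem->stat->count[MEM_CGROUP_ON_MOVE], cpu) += 1;
                spin_unlock(&mc.lock);

                /* Wait until every update-side RCU read section that could
                 * have missed the raised counter has finished. */
                synchronize_rcu();
        }

        static void mem_cgroup_end_move(struct mem_cgroup *mem)
        {
                int cpu;

                if (!mem)
                        return;
                spin_lock(&mc.lock);
                for_each_possible_cpu(cpu)
                        per_cpu(mem->stat->count[MEM_CGROUP_ON_MOVE], cpu) -= 1;
                spin_unlock(&mc.lock);
        }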
    By this, the file-status-counting routine can check whether it needs to call
    lock_page_cgroup().  In most cases, it doesn't need to call it.
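
    On the update side, the fast path stays lockless under rcu_read_lock() and
    falls back to lock_page_cgroup() only while a move may be in flight.  Again
    a simplified sketch, illustrative only:

        void mem_cgroup_update_file_mapped(struct page *page, int val)
        {
                struct page_cgroup *pc = lookup_page_cgroup(page);
                struct mem_cgroup *mem;
                bool need_unlock = false;

                if (unlikely(!pc))
                        return;

                rcu_read_lock();
                mem = pc->mem_cgroup;
                if (unlikely(!mem || !PageCgroupUsed(pc)))
                        goto out;
                /* pc->mem_cgroup can only be overwritten while ON_MOVE > 0. */
                if (unlikely(this_cpu_read(mem->stat->count[MEM_CGROUP_ON_MOVE]) > 0)) {
                        lock_page_cgroup(pc);          /* slow path */
                        need_unlock = true;
                        mem = pc->mem_cgroup;
                        if (!mem || !PageCgroupUsed(pc))
                                goto out;
                }
                if (val > 0) {
                        this_cpu_inc(mem->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
                        SetPageCgroupFileMapped(pc);
                } else {
                        this_cpu_dec(mem->stat->count[MEM_CGROUP_STAT_FILE_MAPPED]);
                        ClearPageCgroupFileMapped(pc);
                }
        out:
                if (need_unlock)
                        unlock_page_cgroup(pc);
                rcu_read_unlock();
        }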
    
    The following is perf data of a process which mmap()s and munmap()s 32MB of
    file cache in a minute.
    
    Before patch:
        28.25%     mmap  mmap               [.] main
        22.64%     mmap  [kernel.kallsyms]  [k] page_fault
         9.96%     mmap  [kernel.kallsyms]  [k] mem_cgroup_update_file_mapped
         3.67%     mmap  [kernel.kallsyms]  [k] filemap_fault
         3.50%     mmap  [kernel.kallsyms]  [k] unmap_vmas
         2.99%     mmap  [kernel.kallsyms]  [k] __do_fault
         2.76%     mmap  [kernel.kallsyms]  [k] find_get_page
    
    After patch:
        30.00%     mmap  mmap               [.] main
        23.78%     mmap  [kernel.kallsyms]  [k] page_fault
         5.52%     mmap  [kernel.kallsyms]  [k] mem_cgroup_update_file_mapped
         3.81%     mmap  [kernel.kallsyms]  [k] unmap_vmas
         3.26%     mmap  [kernel.kallsyms]  [k] find_get_page
         3.18%     mmap  [kernel.kallsyms]  [k] __do_fault
         3.03%     mmap  [kernel.kallsyms]  [k] filemap_fault
         2.40%     mmap  [kernel.kallsyms]  [k] handle_mm_fault
         2.40%     mmap  [kernel.kallsyms]  [k] do_page_fault
    
    This patch reduces memcg's cost to some extent.
    (mem_cgroup_update_file_mapped is called on both map and unmap.)
    
    Note: It seems some more improvements are required, but I have no concrete
          idea yet; maybe removing the set/unset of the flag is required.
    Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    Reviewed-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
    Cc: Balbir Singh <balbir@in.ibm.com>
    Cc: Greg Thelen <gthelen@google.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    32047e2a