    HWPOISON: add memory cgroup filter · 4fd466eb
    Andi Kleen authored
    The hwpoison test suite needs to inject hwpoison into a collection of
    selected task pages, and must not touch pages not owned by those tasks,
    which could kill important system processes such as init. (It is OK to
    mis-hwpoison free/unowned pages as well as shared clean pages.
    Mis-hwpoisoning shared dirty pages will kill all tasks sharing them, so
    the test suite will target all or none of such tasks in the first place.)
    
    The memory cgroup serves this purpose well. We can put the target
    processes under the control of a memory cgroup, and tell the hwpoison
    injection code to only kill pages associated with some active memory
    cgroup.
    
    The prerequisite for doing hwpoison stress tests with mem_cgroup is
    that the mem_cgroup code tracks task pages _accurately_ (unless the page
    is locked), which we believe is/should be true.
    
    The benefit is a simpler hwpoison injector. The mem_cgroup code will
    also be tested automatically by the hwpoison test cases.
    
    The alternative pin-pfn/unpin-pfn interfaces can also reliably delegate
    the (process and page flags) filtering functions to user space.
    However, a prototype implementation showed that this scheme adds more
    complexity than we wanted.
    
    Example test case:
    
    	mkdir /cgroup/hwpoison
    
    	usemem -m 100 -s 1000 &
    	echo `jobs -p` > /cgroup/hwpoison/tasks
    
    	memcg_ino=$(ls -id /cgroup/hwpoison | cut -f1 -d' ')
    	echo $memcg_ino > /debug/hwpoison/corrupt-filter-memcg
    
    	page-types -p `pidof init`   --hwpoison  # shall do nothing
    	page-types -p `pidof usemem` --hwpoison  # poison its pages
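
    The filter setup in the test case above could also be wrapped in a
    small helper. This is only a sketch: the function names memcg_inode and
    hwpoison_set_memcg_filter are hypothetical, and it assumes GNU stat
    (coreutils) is available and that the filter file defaults to the
    /debug/hwpoison/corrupt-filter-memcg path used above.

```shell
#!/bin/sh

# Hypothetical helper: print the inode number of a cgroup directory.
# Assumes GNU stat; "%i" is the inode number format specifier.
memcg_inode() {
    stat -c %i "$1"
}

# Hypothetical helper: write the inode of cgroup directory $1 into the
# hwpoison memcg filter file $2. The default path is the one used in the
# test case above; adjust it if debugfs is mounted elsewhere.
hwpoison_set_memcg_filter() {
    cgroup_dir=$1
    filter_file=${2:-/debug/hwpoison/corrupt-filter-memcg}

    # Refuse to write a filter for a cgroup that does not exist.
    if [ ! -d "$cgroup_dir" ]; then
        echo "no such cgroup directory: $cgroup_dir" >&2
        return 1
    fi

    ino=$(memcg_inode "$cgroup_dir") || return 1
    echo "$ino" > "$filter_file"
}
```

    With these definitions, the two memcg_ino lines above reduce to
    "hwpoison_set_memcg_filter /cgroup/hwpoison". Reading the inode with
    stat avoids depending on the column layout of ls output.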
    
    [AK: Fix documentation]
    [Add fix for problem noticed by Li Zefan <lizf@cn.fujitsu.com>;
    dentry in the css could be NULL]
    
    CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
    CC: Hugh Dickins <hugh.dickins@tiscali.co.uk>
    CC: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
    CC: Balbir Singh <balbir@linux.vnet.ibm.com>
    CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
    CC: Li Zefan <lizf@cn.fujitsu.com>
    CC: Paul Menage <menage@google.com>
    CC: Nick Piggin <npiggin@suse.de>
    CC: Andi Kleen <andi@firstfloor.org>
    Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
    Signed-off-by: Andi Kleen <ak@linux.intel.com>