• Wanpeng Li's avatar
    sched: Fix unreleased llc_shared_mask bit during CPU hotplug · fdb7a047
    Wanpeng Li authored
    commit 03bd4e1f upstream.
    
    The following bug can be triggered by hot adding and removing a large number of
    xen domain0's vcpus repeatedly:
    
    	BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group
    	PGD 5a9d5067 PUD 13067 PMD 0
    	Oops: 0000 [#3] SMP
    	[...]
    	Call Trace:
    	load_balance
    	? _raw_spin_unlock_irqrestore
    	idle_balance
    	__schedule
    	schedule
    	schedule_timeout
    	? lock_timer_base
    	schedule_timeout_uninterruptible
    	msleep
    	lock_device_hotplug_sysfs
    	online_store
    	dev_attr_store
    	sysfs_write_file
    	vfs_write
    	SyS_write
    	system_call_fastpath
    
    Last level cache shared mask is built during CPU up and the
    build_sched_domain() routine takes advantage of it to setup
    the sched domain CPU topology.
    
    However, llc_shared_mask is not released during CPU disable,
    which leads to an invalid sched domainCPU topology.
    
    This patch fix it by releasing the llc_shared_mask correctly
    during CPU disable.
    
    Yasuaki also reported that this can happen on real hardware:
    
      https://lkml.org/lkml/2014/7/22/1018
    
    His case is here:
    
    	==
    	Here is an example on my system.
    	My system has 4 sockets and each socket has 15 cores and HT is
    	enabled. In this case, each core of sockes is numbered as
    	follows:
    
    		 | CPU#
    	Socket#0 | 0-14 , 60-74
    	Socket#1 | 15-29, 75-89
    	Socket#2 | 30-44, 90-104
    	Socket#3 | 45-59, 105-119
    
    	Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000.
    
    	It means that last level cache of Socket#2 is shared with
    	CPU#30-44 and 90-104.
    
    	When hot-removing socket#2 and #3, each core of sockets is
    	numbered as follows:
    
    		 | CPU#
    	Socket#0 | 0-14 , 60-74
    	Socket#1 | 15-29, 75-89
    
    	But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30
    	remains having 0x3fff80000001fffc0000000.
    
    	After that, when hot-adding socket#2 and #3, each core of
    	sockets is numbered as follows:
    
    		 | CPU#
    	Socket#0 | 0-14 , 60-74
    	Socket#1 | 15-29, 75-89
    	Socket#2 | 30-59
    	Socket#3 | 90-119
    
    	Then llc_shared_mask of CPU#30 becomes
    	0x3fff8000fffffffc0000000. It means that last level cache of
    	Socket#2 is shared with CPU#30-59 and 90-104. So the mask has
    	the wrong value.
    Signed-off-by: default avatarWanpeng Li <wanpeng.li@linux.intel.com>
    Tested-by: default avatarLinn Crosetto <linn@hp.com>
    Reviewed-by: default avatarBorislav Petkov <bp@suse.de>
    Reviewed-by: default avatarToshi Kani <toshi.kani@hp.com>
    Reviewed-by: default avatarYasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Prarit Bhargava <prarit@redhat.com>
    Cc: Steven Rostedt <srostedt@redhat.com>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
    Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
    fdb7a047
smpboot.c 35.4 KB