    x86: fix NULL pointer deref in __switch_to · 54481cf8
    Suresh Siddha authored
    I am able to reproduce the oops reported by Simon in __switch_to() with
    lguest.
    
    My debugging showed that there is at least one lguest-specific
    issue (which should be present in 2.6.25 and earlier as well), and it was
    exposed as a kernel oops by the recent FPU dynamic allocation patches.
    
    In addition to the previous possible scenario (with fpu_counter), in the
    presence of lguest it is possible that the CPU's TS bit is still set while the
    lguest launcher task's thread_info has TS_USEDFPU still set.
    
    This is because of the way the lguest launcher handles the guest's TS bit
    (look at lguest_set_ts() in lguest_arch_run_guest()). This can result
    in a DNA fault while doing unlazy_fpu() in __switch_to(). That DNA fault
    is then taken in the context of the new process that is
    getting context switched in (as opposed to handling the DNA fault in the context
    of the lguest launcher/helper process).
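
To make the inconsistent state easier to picture, here is a toy user-space model of it. This is only a sketch, not kernel code: `toy_cpu`, `toy_task`, `toy_switch_faults()` and `toy_self_check()` are hypothetical names, and the model assumes nothing more than "touching the FPU while TS is set raises a DNA fault" and "the switch path saves FPU state only for a task marked as having used the FPU".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is illustrative, not a kernel API. */
struct toy_cpu {
    bool cr0_ts;    /* stands in for the CPU's TS bit */
};

struct toy_task {
    bool used_fpu;  /* stands in for TS_USEDFPU in thread_info */
};

/*
 * Models the FPU save done for the outgoing task during a context
 * switch. Returns true if that save would raise a DNA fault, i.e. the
 * outgoing task is marked as an FPU user while TS is set.
 */
static bool toy_switch_faults(const struct toy_cpu *cpu,
                              const struct toy_task *prev)
{
    if (!prev->used_fpu)
        return false;       /* nothing to save, FPU is not touched */
    return cpu->cr0_ts;     /* FPU access with TS set faults */
}

/* Walks through the consistent state and the inconsistent one. */
static void toy_self_check(void)
{
    struct toy_cpu cpu = { .cr0_ts = false };
    struct toy_task launcher = { .used_fpu = true };

    /* Consistent: the task owns the FPU and TS is clear. */
    assert(!toy_switch_faults(&cpu, &launcher));

    /* Inconsistent: TS was set while the flag stayed set. */
    cpu.cr0_ts = true;
    assert(toy_switch_faults(&cpu, &launcher));
}
```

In this model the fault is reported from the switch path itself, which corresponds to the point above: it is raised while switching away from the launcher rather than while the launcher is running normally.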
    
    This is wrong in both pre and post 2.6.25 kernels. In the recent
    2.6.26-rc series, this is showing up as NULL pointer dereferences or
    "sleeping function called from atomic context" warnings (in __switch_to()), as
    we free and dynamically allocate the FPU context for the newly
    created threads. Older kernels might show some FPU corruption for processes
    running inside of lguest.
    
    With the appended patch, my test system has been running for more than 50 minutes
    now. So at least some of your oopses (hopefully all!) should get fixed.
    Please give it a try. I will spend more time with this fix tomorrow.

    Reported-by: Simon Holm Thøgersen <odie@cs.aau.dk>
    Reported-by: Patrick McHardy <kaber@trash.net>
    Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
    Signed-off-by: Ingo Molnar <mingo@elte.hu>