    [PATCH] Runtime memory barrier patching · 8aba0a3d
    Andi Kleen authored
    This implements automatic code patching of memory barriers based
    on the CPU capabilities. Normally lock ; addl $0,(%esp) barriers
    are used, but these are a bit slow on the Pentium 4.
    
    Linus proposed this a few weeks ago after the support for SSE1/SSE2
    barriers was introduced. I have now gotten around to implementing it.
    
    The main advantage is that it allows distributors to ship fewer binary
    kernels but still get fast kernels. In particular, it avoids the
    need for a special Pentium 4 kernel.
    
    The patching code is quite generic and could be used to patch
    other instructions (like prefetches or specific other critical
    instructions) too.
    Thanks to Rusty's in-kernel module loader, it also works seamlessly for modules.
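
The general idea can be illustrated in user space. The following is only a minimal sketch, not the code from this patch: the `alt_entry` layout, the `FEATURE_SSE2` bit, and `apply_alternatives()` are hypothetical names chosen for illustration. It assumes each patch site is described by a table entry, and that a shorter replacement is padded with one-byte NOPs (0x90) so the following instructions stay where they were.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical CPU feature bit; not the kernel's real feature numbering. */
#define FEATURE_SSE2 (1u << 0)

/* Hypothetical description of one patch site inside a code buffer. */
struct alt_entry {
    size_t offset;        /* start of the original instruction */
    uint8_t orig_len;     /* bytes occupied by the original instruction */
    uint8_t repl_len;     /* bytes in the replacement, must be <= orig_len */
    unsigned feature;     /* feature bit that enables the replacement */
    const uint8_t *repl;  /* replacement instruction bytes */
};

/*
 * Walk the table and overwrite each site whose feature bit is set in
 * cpu_features. Leftover bytes are filled with NOPs so the patched
 * region keeps its original length. Returns the number of sites patched.
 */
static int apply_alternatives(uint8_t *code, size_t code_len,
                              const struct alt_entry *tab, size_t n,
                              unsigned cpu_features)
{
    int patched = 0;

    for (size_t i = 0; i < n; i++) {
        const struct alt_entry *a = &tab[i];

        if (!(cpu_features & a->feature))
            continue;   /* CPU lacks the feature: keep the original */
        if (a->repl_len > a->orig_len ||
            a->offset + a->orig_len > code_len)
            continue;   /* malformed entry: refuse to patch */

        memcpy(code + a->offset, a->repl, a->repl_len);
        memset(code + a->offset + a->repl_len, 0x90,
               (size_t)(a->orig_len - a->repl_len));
        patched++;
    }
    return patched;
}
```

For example, an entry could cover the 5-byte `lock ; addl $0,(%esp)` (bytes F0 83 04 24 00) and name the 3-byte `mfence` (0F AE F0) as its replacement; on a CPU with the feature bit set, the site becomes `mfence` followed by two NOPs. Calling the function with no feature bits set leaves every site untouched.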
    
    The patching is done before other CPUs start, to avoid potential
    errata with self-modifying code on SMP systems. It makes no
    attempt to automatically handle asymmetric systems (a secondary
    CPU having fewer capabilities than the boot CPU). In this
    case, just boot with "noreplacement".
module.c 3.38 KB