    x86/lib/copy_page_64.S: Use generic ALTERNATIVE macro · 090a3f61
    Borislav Petkov authored
    ... instead of the semi-version with the spelled out sections.
    
    What is more, make the REP_GOOD version be the default copy_page()
    version as the majority of the relevant x86 CPUs do set
    X86_FEATURE_REP_GOOD. Thus, copy_page gets compiled to:
    
      ffffffff8130af80 <copy_page>:
      ffffffff8130af80:       e9 0b 00 00 00          jmpq   ffffffff8130af90 <copy_page_regs>
      ffffffff8130af85:       b9 00 02 00 00          mov    $0x200,%ecx
      ffffffff8130af8a:       f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
      ffffffff8130af8d:       c3                      retq
      ffffffff8130af8e:       66 90                   xchg   %ax,%ax
    
      ffffffff8130af90 <copy_page_regs>:
      ...
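
The listing above can be checked by hand. Opcode `e9` is a JMP with a 32-bit little-endian displacement taken relative to the address of the next instruction, and `mov $0x200,%ecx` sets the count for `rep movsq`, which copies 8 bytes per iteration. The sketch below is only an illustration of that arithmetic: the helper name `jmp_rel32_target` is hypothetical and is not part of the kernel or of this commit.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def jmp_rel32_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the target of a 5-byte JMP rel32 (opcode 0xe9).

    Hypothetical helper for illustration only.
    """
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE9:
        raise ValueError("not a 5-byte JMP rel32")
    # The displacement is a signed 32-bit little-endian value.
    (disp,) = struct.unpack("<i", insn_bytes[1:])
    # It is added to the address of the *next* instruction.
    return (insn_addr + len(insn_bytes) + disp) & MASK64


# Bytes and address taken from the first line of the listing.
target = jmp_rel32_target(0xFFFFFFFF8130AF80, bytes.fromhex("e90b000000"))
print(hex(target))  # 0xffffffff8130af90, i.e. <copy_page_regs>

# 0x200 quadword moves of 8 bytes each cover one 4 KiB page.
bytes_copied = 0x200 * 8
print(bytes_copied)  # 4096
```

The computed target matches the `<copy_page_regs>` label in the listing, and the byte count matches the size of a 4 KiB page.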
    
    and after the alternatives have run, the JMP to the old, unrolled
    version gets NOPed out:
    
      ffffffff8130af80 <copy_page>:
      ffffffff8130af80:       66 66 90                xchg   %ax,%ax
      ffffffff8130af83:       66 90                   xchg   %ax,%ax
      ffffffff8130af85:       b9 00 02 00 00          mov    $0x200,%ecx
      ffffffff8130af8a:       f3 48 a5                rep movsq %ds:(%rsi),%es:(%rdi)
      ffffffff8130af8d:       c3                      retq
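
At the byte level, the difference between the two listings is that the five bytes of the JMP are overwritten with a 3-byte and a 2-byte NOP, and everything after them is untouched. The following sketch reproduces that effect on a plain byte string. It is not the kernel's alternatives patching code: the function name `nop_out_jmp` is made up for illustration, and the NOP encodings are simply copied from the listing.

```python
# copy_page as assembled, bytes taken from the first listing.
BEFORE = bytes.fromhex(
    "e90b000000"  # jmpq copy_page_regs
    "b900020000"  # mov $0x200,%ecx
    "f348a5"      # rep movsq
    "c3"          # retq
    "6690"        # 2-byte padding NOP
)

# NOP encodings as they appear in the patched listing.
NOP3 = bytes.fromhex("666690")
NOP2 = bytes.fromhex("6690")


def nop_out_jmp(code: bytes) -> bytes:
    """Overwrite a leading 5-byte JMP rel32 with NOPs (illustration only)."""
    if code[0] != 0xE9:
        raise ValueError("code does not start with JMP rel32")
    patched = bytearray(code)
    # Same length in, same length out: 3 + 2 bytes replace the 5-byte JMP.
    patched[0:5] = NOP3 + NOP2
    return bytes(patched)


after = nop_out_jmp(BEFORE)
print(after[:14].hex())  # 6666906690b900020000f348a5c3
```

The function length does not change, so execution simply falls through the NOPs into the `rep movsq` sequence.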
    
    On modern uarches, those NOPs are cheaper than the unconditional JMP
    that was there previously.
    
    Signed-off-by: Borislav Petkov <bp@suse.de>
copy_page_64.S 1.92 KB