    x86: atomic64 assembly improvements · cb8095bb
    Jan Beulich authored
    In the "xchg" implementation, %ebx and %ecx don't need to be copied
    into %eax and %edx respectively (the copy is only necessary when the
    intent is merely to read the stored value).
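
    As an illustration of why the copy is unnecessary, here is a minimal C
    sketch of an exchange built on a compare-and-swap loop. It is not the
    kernel's assembly: the function name is hypothetical and the GCC/Clang
    `__atomic` builtins stand in for `cmpxchg8b`. The point it shows is that
    the initial "expected" value may be arbitrary, because a failed compare
    reloads it with the current contents and the loop simply retries.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: atomically store new_val and return the old value. */
uint64_t xchg64_sketch(uint64_t *p, uint64_t new_val)
{
    /* Arbitrary starting guess; it does not have to match *p. */
    uint64_t expected = 0;

    /*
     * On failure the builtin writes the current value of *p into
     * 'expected' (much as cmpxchg8b reloads %edx:%eax), so the next
     * iteration compares against an up-to-date value.
     */
    while (!__atomic_compare_exchange_n(p, &expected, new_val, 0,
                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
        ;

    /* 'expected' now holds the value that was replaced. */
    return expected;
}
```

    A pure read, by contrast, has to pass the same value as both the
    expected and the replacement value so that a successful compare leaves
    memory unchanged, which is where the copy is needed.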
    
    In the "add_unless" implementation, swapping the use of %ecx and %esi
    for passing arguments allows %esi to become an input only (i.e.
    permitting the register to be re-used to address the same object
    without reload).
    
    In "{add,sub}_return", doing the initial read64 through the passed in
    %ecx decreases a register dependency.
    
    In "inc_not_zero", a branch can be eliminated by or-ing together the
    two halves of the current (64-bit) value, and code size can be further
    reduced by adjusting the arithmetic slightly.
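
    The zero test described above can be sketched in C as follows. This is
    an illustration under stated assumptions, not the patched assembly: the
    function name is hypothetical and the GCC/Clang `__atomic` builtins
    replace the `cmpxchg8b` loop. It shows a single check on the OR of the
    two 32-bit halves taking the place of a separate test per half.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: increment *p unless it is zero.
 * Returns 1 if the increment happened, 0 if the value was zero.
 */
int inc_not_zero64_sketch(uint64_t *p)
{
    uint64_t old = __atomic_load_n(p, __ATOMIC_RELAXED);

    for (;;) {
        uint32_t lo = (uint32_t)old;
        uint32_t hi = (uint32_t)(old >> 32);

        /* The 64-bit value is zero exactly when lo | hi is zero. */
        if ((lo | hi) == 0)
            return 0;

        /* On failure 'old' is refreshed with the current value; retry. */
        if (__atomic_compare_exchange_n(p, &old, old + 1, 0,
                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
            return 1;
    }
}
```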
    
    v2: Undo the folding of "xchg" and "set".
    Signed-off-by: Jan Beulich <jbeulich@suse.com>
    Link: http://lkml.kernel.org/r/4F19A2BC020000780006E0DC@nat28.tlf.novell.com
    Cc: Luca Barbieri <luca@luca-barbieri.com>
    Cc: Eric Dumazet <eric.dumazet@gmail.com>
    Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
atomic64_cx8_32.S 3.1 KB