    [S390] convert/optimize csum_fold() to C · 04efc3be
    Heiko Carstens authored
    By now gcc generates better code than the old inline assembly
    does. The original inline assembly results in:
    
    lr	%r1,%r2
    sr	%r3,%r3
    lr	%r2,%r1
    srdl	%r2,16
    alr	%r2,%r3
    alr	%r1,%r2
    srl	%r1,16
    xilf	%r1,65535
    llghr	%r2,%r1
    br	%r14
    
    From the C code gcc generates this:
    
    rll	%r1,%r2,16
    ar	%r1,%r2
    srl	%r1,16
    xilf	%r1,65535
    llghr	%r2,%r1
    br	%r14
    
    In addition we don't have any static register allocations anymore and
    gcc is free to shuffle instructions around for better pipeline usage.
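
    For illustration, a minimal C sketch that is consistent with the
    rotate/add/shift/invert sequence shown above. It is not taken from
    the patch itself: the function names and the plain stdint types are
    assumptions (the kernel uses its own __wsum/__sum16 types). The
    second function is the textbook two-step fold, included only as a
    cross-check.

```c
#include <stdint.h>

/*
 * Hypothetical name, not the kernel's function: fold a 32-bit
 * one's-complement sum down to 16 bits and invert it.
 */
static inline uint16_t csum_fold_sketch(uint32_t sum)
{
	/*
	 * Add the value rotated by 16 bits. The upper halfword of the
	 * result then holds lo + hi, and the carry out of the lower
	 * halfword supplies the end-around carry.
	 */
	sum += (sum >> 16) | (sum << 16);
	/* Keep the upper halfword and take its one's complement. */
	return (uint16_t)~(sum >> 16);
}

/* Textbook two-step fold, used here only for comparison. */
static inline uint16_t csum_fold_reference(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

    Because the sketch contains no inline assembly and no fixed
    registers, the compiler can choose registers and schedule the
    instructions itself.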

    Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
    Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
checksum.h 3.52 KB