    powerpc/lib: Adjust .balign inside string functions for PPC32 · 1128bb78
    Christophe Leroy authored
    Commit 87a156fb ("Align hot loops of some string functions")
    degraded the performance of string functions by adding useless
    nops.
    
    A simple benchmark on an 8xx that calls memchr() 100000 times, with
    the match on the first byte, runs in 41668 TB ticks before this
    patch and in 35986 TB ticks after it, an improvement of
    approximately 14% (5682 TB ticks).
    
    Another benchmark doing the same with a memchr() that matches the
    128th byte runs in 1011365 TB ticks before this patch and in
    1005682 TB ticks after it. So regardless of the number of loop
    iterations, removing those useless nops saves about 5683 TB ticks.
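
For reference, the kind of measurement described above could be sketched in portable C as follows. This is not the benchmark actually used for the numbers in this message: the helper name bench_memchr() is hypothetical, and it times with the standard clock() function rather than the PowerPC timebase register, so its results are not comparable to the TB tick counts quoted here.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#define CALLS 100000	/* same call count as in the commit message */

/*
 * Hypothetical helper (not part of the patch): fill buf with 'a', place a
 * single 'x' at index match_pos, then time CALLS invocations of memchr()
 * searching for 'x'. Returns the elapsed clock() ticks, or -1 if match_pos
 * is out of range or memchr() returned an unexpected pointer.
 */
long bench_memchr(char *buf, size_t len, size_t match_pos)
{
	/* Calling through a volatile pointer keeps the compiler from
	 * hoisting or folding the memchr() calls out of the loop. */
	void *(*volatile fn)(const void *, int, size_t) = memchr;
	void *hit = NULL;
	clock_t start, end;
	int i;

	if (match_pos >= len)
		return -1;

	memset(buf, 'a', len);
	buf[match_pos] = 'x';

	start = clock();
	for (i = 0; i < CALLS; i++)
		hit = fn(buf, 'x', len);
	end = clock();

	if (hit != buf + match_pos)
		return -1;

	return (long)(end - start);
}
```

Calling it with match_pos 0 corresponds to the first-byte case and match_pos 127 to the 128th-byte case; comparing the two on builds with and without the patch would show whether the saving is a fixed per-call cost, as the figures above suggest.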
    
    Fixes: 87a156fb ("Align hot loops of some string functions")
    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>