    powerpc/64: add 32 bytes prechecking before using VMX optimization on memcmp() · c2a4e54e
    Simon Guo authored
    This patch is based on the previous VMX patch on memcmp().
    
    To optimize ppc64 memcmp() with VMX instructions, we need to account
    for the penalty VMX brings: if the kernel uses VMX instructions, it
    must save and restore the current thread's VMX registers. PPC has
    32 x 128-bit VMX registers, which means 32 x 16 = 512 bytes to store
    and load.
    
    The major concern regarding memcmp() performance in the kernel is KSM,
    which uses memcmp() frequently to merge identical pages. So it makes
    sense to take some measurements on KSM to see whether any improvement
    can be made here. Cyril Bur indicated in the following mail that
    memcmp() for KSM is more likely to fail (mismatch) early, within the
    first bytes:
    	https://patchwork.ozlabs.org/patch/817322/#1773629
    This patch follows up on that observation.
    
    Testing shows that KSM memcmp() tends to fail early, within the first
    32 bytes. More specifically:
        - 76% of cases fail/mismatch before 16 bytes;
        - 83% of cases fail/mismatch before 32 bytes;
        - 84% of cases fail/mismatch before 64 bytes.
    So 32 bytes looks like a better choice than the other sizes for
    pre-checking.
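
To make the trade-off behind the figures above concrete, here is a small
C sketch that computes how many additional percentage points each larger
pre-check size catches. The struct and function names are hypothetical,
chosen only for illustration; the numbers are the KSM figures quoted in
this message.

```c
#include <stddef.h>

/* Hypothetical container for the figures quoted in the commit message. */
struct precheck_sample {
    int bytes;     /* pre-check length in bytes */
    int pct_fail;  /* % of KSM memcmp() calls that mismatch before it */
};

static const struct precheck_sample samples[] = {
    { 16, 76 },
    { 32, 83 },
    { 64, 84 },
};

/*
 * Extra percentage points caught by samples[i] compared with the next
 * smaller pre-check size. For i == 0 there is no smaller size, so the
 * full percentage is returned.
 */
int marginal_gain(size_t i)
{
    if (i == 0)
        return samples[0].pct_fail;
    return samples[i].pct_fail - samples[i - 1].pct_fail;
}
```

Going from 16 to 32 bytes catches 7 more points of early mismatches,
while doubling again to 64 bytes catches only 1 more, which is
consistent with choosing 32 bytes.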
    
    Early failure also holds for memcmp() in the non-KSM case. With a
    non-typical call load, ~73% of cases fail within the first 32 bytes.
    
    This patch adds a 32-byte pre-check before jumping into VMX
    operations, to avoid an unnecessary VMX penalty. It is not limited to
    the KSM case. Testing shows a ~20% improvement in average memcmp()
    execution time with this patch.
    
    Note that the 32B pre-check is only performed when the compare size
    is large enough (>=4K currently) to allow VMX operation.
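
The control flow described above can be sketched in C as follows. This
is only an illustration under stated assumptions: the actual change is
in PowerPC assembly (memcmp_64.S), and the names VMX_THRESHOLD,
PRECHECK_LEN, vmx_compare() and memcmp_precheck() are hypothetical. The
vector path is replaced by a plain memcmp() stand-in so the example is
portable.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical constants mirroring the values named in the message. */
#define VMX_THRESHOLD 4096  /* minimum length that allows the VMX path */
#define PRECHECK_LEN  32    /* bytes compared before paying the VMX cost */

/*
 * Stand-in for the vector path. In the kernel this is where the VMX
 * registers would be saved, used and restored; here it simply defers
 * to memcmp().
 */
static int vmx_compare(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n);
}

int memcmp_precheck(const void *a, const void *b, size_t n)
{
    if (n >= VMX_THRESHOLD) {
        /* Cheap scalar check of the first 32 bytes. */
        int r = memcmp(a, b, PRECHECK_LEN);
        if (r != 0)
            return r;  /* early mismatch: VMX penalty avoided */

        /* First 32 bytes match: continue with the vector path. */
        return vmx_compare((const char *)a + PRECHECK_LEN,
                           (const char *)b + PRECHECK_LEN,
                           n - PRECHECK_LEN);
    }

    /* Short compares never use VMX, so no pre-check is needed. */
    return memcmp(a, b, n);
}
```

With two 4K buffers that differ at byte 5, the function returns from the
pre-check without reaching vmx_compare(); buffers that differ only at
byte 4000 fall through to the vector stand-in.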
    
    The detailed data and analysis are at:
    https://github.com/justdoitqd/publicFiles/blob/master/memcmp/README.md

    Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
memcmp_64.S 11.3 KB