    powerpc/lib: optimise 32 bits __clear_user() · f36bbf21
    Christophe Leroy authored
    Rewrite __clear_user() on the same principle as memset(0), making use
    of dcbz to clear complete cache lines.
    
    This code is a copy/paste of memset(), modified to retrieve the
    remaining number of bytes to be cleared, which must be returned
    in case of error.
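
To illustrate the principle described above, here is a minimal C sketch, not the actual assembly in string_32.S. It clears byte by byte up to a cache-line boundary, then one whole line per step (where the real code would use dcbz), then the tail, and returns the number of bytes left uncleared. The names `clear_sketch`, `zero_line`, `LINE` and the `limit` parameter are hypothetical. `limit` models a faulting address with a pointer comparison, whereas the kernel would presumably detect the fault through its exception handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE 16 /* assumed cache-line size, for illustration only */

/* Hypothetical stand-in for dcbz: zero one aligned cache line. */
static void zero_line(uint8_t *p)
{
    memset(p, 0, LINE);
}

/*
 * Clear n bytes at 'to'. 'limit' models the first faulting address
 * (NULL means no fault). Returns the number of bytes NOT cleared.
 */
static size_t clear_sketch(uint8_t *to, size_t n, const uint8_t *limit)
{
    /* Head: byte stores until 'to' is line-aligned. */
    while (n && ((uintptr_t)to % LINE)) {
        if (limit && to >= limit)
            return n;
        *to++ = 0;
        n--;
    }
    /* Body: one whole line per iteration, as dcbz would. */
    while (n >= LINE && !(limit && to + LINE > limit)) {
        zero_line(to);
        to += LINE;
        n -= LINE;
    }
    /* Tail: remaining bytes, stopping at the modelled fault. */
    while (n) {
        if (limit && to >= limit)
            return n;
        *to++ = 0;
        n--;
    }
    return 0;
}

/* Usage: clear 40 bytes starting 3 bytes into an aligned buffer. */
void clear_sketch_demo(void)
{
    _Alignas(LINE) uint8_t buf[64];

    memset(buf, 0xff, sizeof(buf));
    assert(clear_sketch(buf + 3, 40, NULL) == 0); /* no fault */
    assert(buf[2] == 0xff && buf[3] == 0);
    assert(buf[42] == 0 && buf[43] == 0xff);

    memset(buf, 0xff, sizeof(buf));
    /* Fault modelled at buf + 32: 29 bytes cleared, 11 left over. */
    assert(clear_sketch(buf + 3, 40, buf + 32) == 11);
}
```

The return convention mirrors the one the commit message describes: zero on success, otherwise the count of bytes that could not be cleared.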
    
    In the same way as done on PPC64 in commit 17968fbb
    ("powerpc: 64bit optimised __clear_user"), the patch moves
    __clear_user() into a dedicated file, string_32.S.
    
    On an MPC885, throughput is almost doubled:
    
    Before:
    ~# dd if=/dev/zero of=/dev/null bs=1M count=1000
    1048576000 bytes (1000.0MB) copied, 18.990779 seconds, 52.7MB/s
    
    After:
    ~# dd if=/dev/zero of=/dev/null bs=1M count=1000
    1048576000 bytes (1000.0MB) copied, 9.611468 seconds, 104.0MB/s
    
    On an MPC8321, throughput is multiplied by 2.12:
    
    Before:
    root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000
    1048576000 bytes (1000.0MB) copied, 6.844352 seconds, 146.1MB/s
    
    After:
    root@vgoippro:~# dd if=/dev/zero of=/dev/null bs=1M count=1000
    1048576000 bytes (1000.0MB) copied, 3.218854 seconds, 310.7MB/s
    
    Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
string_32.S 1.43 KB