    crypto: arm/aes-neonbs-ctr - deal with non-multiples of AES block size · c8bf850e
    Ard Biesheuvel authored
    Instead of falling back to C code to deal with the final bit of input
    that is not a round multiple of the block size, handle this in the asm
    code, permitting us to use overlapping loads and stores for performance,
    and implement the 16-byte wide XOR using a single NEON instruction.
    
    Since NEON loads and stores have a natural width of 16 bytes, we need to
    handle inputs of less than 16 bytes in a special way, but this rarely
    occurs in practice so it does not impact performance. All other input
    sizes can be consumed directly by the NEON asm code, although it should
    be noted that the core AES transform can still only process 128 bytes (8
    AES blocks) at a time.
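
The overlapping load/store idea can be illustrated with a minimal C sketch. This is not the kernel's assembly: the helper names `xor16` and `xor_stream_overlap` are hypothetical, the keystream is assumed to be already available as a flat byte array (the real code produces it in NEON registers), and, as the message notes, inputs shorter than 16 bytes would need separate handling. When the length is not a multiple of 16, the last 16 bytes of the input are processed as one full-width block that overlaps the preceding block, so no byte-wise tail loop is needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* XOR 16 bytes; stands in for one 16-byte wide NEON XOR. */
static void xor16(uint8_t *dst, const uint8_t *a, const uint8_t *b)
{
    for (int i = 0; i < 16; i++)
        dst[i] = a[i] ^ b[i];
}

/*
 * Hypothetical helper: dst = src ^ ks for len bytes, with len >= 16.
 * ks is assumed to hold len bytes of precomputed keystream.
 */
static void xor_stream_overlap(uint8_t *dst, const uint8_t *src,
                               const uint8_t *ks, size_t len)
{
    uint8_t tail[16];
    size_t full = len / 16;
    size_t rem = len % 16;

    /*
     * Load and process the final 16 bytes before any store, so the
     * overlapping region is still unmodified when dst == src.
     */
    if (rem)
        xor16(tail, src + len - 16, ks + len - 16);

    /* Full 16-byte blocks. */
    for (size_t i = 0; i < full; i++)
        xor16(dst + 16 * i, src + 16 * i, ks + 16 * i);

    /*
     * Overlapping store: the first 16 - rem bytes rewrite output that
     * the loop above already produced, with identical values.
     */
    if (rem)
        memcpy(dst + len - 16, tail, 16);
}
```

The tail is computed before the full blocks are stored so that the sketch also works in place; with distinct source and destination buffers the order would not matter.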
    Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
    Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
aes-neonbs-core.S 22.1 KB