1. 04 Dec, 2020 4 commits
  2. 27 Nov, 2020 22 commits
  3. 20 Nov, 2020 13 commits
  4. 13 Nov, 2020 1 commit
    • crypto: arm64/chacha - simplify tail block handling · c4fc6328
      Ard Biesheuvel authored
      Based on lessons learnt from optimizing the 32-bit version of this driver,
      we can simplify the arm64 version considerably, by reordering the final
      two stores when the last block is not a multiple of 64 bytes. This removes
      the need to use permutation instructions to calculate the elements that are
      clobbered by the final overlapping store, given that the store of the
      penultimate block now follows it, and that one carries the correct values
      for those elements already.
      
      While at it, simplify the overlapping loads as well, by calculating the
      address of the final overlapping load upfront, and switching to this
      address for every load that would otherwise extend past the end of the
      source buffer.
      
      There is no impact on performance, but the resulting code is substantially
      smaller and easier to follow.
      
      Cc: Eric Biggers <ebiggers@google.com>
      Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
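
The store reordering and the clamped loads described in the commit message can be illustrated with a small portable C model. This is only a sketch under stated assumptions, not the driver's NEON assembly: the function name `xor_with_overlapping_tail`, the constants `BLK` and `MAXBLK`, and the use of `memcpy` to stand in for 64-byte vector loads and stores are all hypothetical, and the keystream is assumed to be precomputed in `ks`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK    64   /* ChaCha block size in bytes */
#define MAXBLK 4    /* blocks handled per call (assumption of this sketch) */

/*
 * XOR len bytes of src with the keystream ks into dst, where
 * BLK <= len <= MAXBLK * BLK and ks holds MAXBLK * BLK bytes.
 * src may equal dst, because every load happens before any store.
 */
static void xor_with_overlapping_tail(uint8_t *dst, const uint8_t *src,
                                      const uint8_t *ks, size_t len)
{
    uint8_t in[MAXBLK][BLK], out[BLK];
    size_t last = len - BLK;    /* offset of the final overlapping window */
    size_t nfull = len / BLK;   /* number of complete blocks */
    size_t rem = len % BLK;     /* bytes in the partial tail block */
    size_t b, i;

    /* Loads: a block that would run past the end reads from `last` instead,
     * so no load ever touches memory beyond src + len. */
    for (b = 0; b < MAXBLK; b++) {
        size_t off = b * BLK;
        memcpy(in[b], src + (off + BLK <= len ? off : last), BLK);
    }

    if (rem) {
        /* Final window: only its last `rem` bytes must be correct. Rotating
         * keystream block `nfull` by `rem` lines its first `rem` bytes up
         * with them; the leading BLK - rem bytes come out wrong on purpose. */
        const uint8_t *k = ks + nfull * BLK;

        for (i = 0; i < BLK; i++)
            out[i] = in[nfull][i] ^ k[(i + rem) % BLK];
        memcpy(dst + last, out, BLK);   /* overlapping store goes first */
    }

    /* Full blocks are stored afterwards, so the store of the penultimate
     * block overwrites the clobbered overlap with the correct values. */
    for (b = 0; b < nfull; b++) {
        for (i = 0; i < BLK; i++)
            out[i] = in[b][i] ^ ks[b * BLK + i];
        memcpy(dst + b * BLK, out, BLK);
    }
}
```

In this model the store order is what removes the need to patch up the overlap region: if the full-block stores ran first, the wrong leading bytes of the final window would survive in the output and would have to be computed correctly beforehand, which is the job the commit message attributes to the permutation instructions.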