- 22 Mar, 2012 1 commit
-
Jussi Kivilinna authored
This caused a conflict with twofish-x86_64-3way when compiled into the kernel: both defined the same function names without marking them static. Reported-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Acked-by: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
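As an illustration of the kind of fix this implies (a sketch of the general pattern, not the actual patch; the function and module names below are hypothetical), handlers that are only referenced from within one file should be given internal linkage so that two built-in objects cannot clash over the same global symbol:

    #include <linux/init.h>
    #include <linux/module.h>

    /* Sketch: module init/exit handlers are only referenced through
     * module_init()/module_exit(), so they should be static; otherwise an
     * identically named handler in another built-in driver collides at
     * link time when both are compiled into the kernel. */
    static int __init example_cipher_init(void)
    {
            return 0;       /* the real code registers its algorithms here */
    }

    static void __exit example_cipher_exit(void)
    {
            /* the real code unregisters its algorithms here */
    }

    module_init(example_cipher_init);
    module_exit(example_cipher_exit);
    MODULE_LICENSE("GPL");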
-
- 14 Mar, 2012 9 commits
-
Steffen Klassert authored
When padata_do_parallel() is called from multiple cpus for the same padata instance, we can get object reordering on sequence number wrap, because testing for sequence number wrap and resetting the sequence number must happen atomically but is implemented with two atomic operations. This patch fixes this by converting the sequence number from atomic_t to an unsigned int and protecting the access with a spin_lock. As a side effect, we get rid of the sequence number wrap handling because the sequence number now wraps back to zero without the need to do anything. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
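A rough sketch of the before/after described above (simplified types and names, not the actual padata code): two separate atomic operations leave a window between the wrap test and the reset, while a plain counter under a spinlock makes the read-and-increment atomic as a whole, and an unsigned int simply wraps to zero on overflow.

    #include <linux/atomic.h>
    #include <linux/kernel.h>
    #include <linux/spinlock.h>

    /* Before (sketch): the wrap check and the reset are two separate atomic
     * ops, so two CPUs can interleave between them around the wrap point. */
    static atomic_t seq_nr_old = ATOMIC_INIT(0);

    static int next_seq_racy(void)
    {
            if (atomic_read(&seq_nr_old) == INT_MAX)       /* op 1: check  */
                    atomic_set(&seq_nr_old, 0);            /* op 2: reset  */
            return atomic_inc_return(&seq_nr_old);         /* op 3: advance */
    }

    /* After (sketch): one lock covers the whole read-modify-write, and an
     * unsigned int wraps back to zero by itself, so no wrap handling is
     * needed at all. */
    static DEFINE_SPINLOCK(seq_lock);
    static unsigned int seq_nr;

    static unsigned int next_seq(void)
    {
            unsigned int nr;

            spin_lock(&seq_lock);
            nr = seq_nr++;          /* wraps to 0 naturally on overflow */
            spin_unlock(&seq_lock);
            return nr;
    }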
-
Steffen Klassert authored
When a padata object is queued to the serialization queue, another cpu might process and free the padata object. So don't dereference it after queueing to the serialization queue. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
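A condensed sketch of the rule being applied (the structure and function names are hypothetical): once the object is on the serialization queue, another CPU may already have processed and freed it, so anything still needed must be read before queueing.

    #include <linux/list.h>
    #include <linux/spinlock.h>

    struct work_item {
            struct list_head list;
            int seq_nr;
    };

    static LIST_HEAD(serial_queue);
    static DEFINE_SPINLOCK(serial_lock);

    static int queue_for_serialization(struct work_item *item)
    {
            int seq = item->seq_nr;         /* read before publishing */

            spin_lock(&serial_lock);
            list_add_tail(&item->list, &serial_queue);
            spin_unlock(&serial_lock);

            /* do NOT dereference *item here any more; another CPU may
             * already have processed and freed it */
            return seq;
    }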
-
Jussi Kivilinna authored
Patch adds an x86_64 assembler implementation of the Camellia block cipher. Two sets of functions are provided. The first set is regular 'one block at a time' encrypt/decrypt functions. The second is 'two blocks at a time' functions that gain a performance increase on out-of-order CPUs. Performance of the 2-way functions should be equal to the 1-way functions on in-order CPUs. Patch has been tested with tcrypt and automated filesystem tests.

Tcrypt benchmark results:

AMD Phenom II 1055T (fam:16, model:10):

camellia-asm vs camellia_generic:
128bit key:                                       (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.27x   1.22x   1.30x   1.42x   1.30x   1.34x   1.19x   1.05x   1.23x   1.24x
64B     1.74x   1.79x   1.43x   1.87x   1.81x   1.87x   1.48x   1.38x   1.55x   1.62x
256B    1.90x   1.87x   1.43x   1.94x   1.94x   1.95x   1.63x   1.62x   1.67x   1.70x
1024B   1.96x   1.93x   1.43x   1.95x   1.98x   2.01x   1.67x   1.69x   1.74x   1.80x
8192B   1.96x   1.96x   1.39x   1.93x   2.01x   2.03x   1.72x   1.64x   1.71x   1.76x

256bit key:                                       (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.23x   1.23x   1.33x   1.39x   1.34x   1.38x   1.04x   1.18x   1.21x   1.29x
64B     1.72x   1.69x   1.42x   1.78x   1.81x   1.89x   1.57x   1.52x   1.56x   1.65x
256B    1.85x   1.88x   1.42x   1.86x   1.93x   1.96x   1.69x   1.65x   1.70x   1.75x
1024B   1.88x   1.86x   1.45x   1.95x   1.96x   1.95x   1.77x   1.71x   1.77x   1.78x
8192B   1.91x   1.86x   1.42x   1.91x   2.03x   1.98x   1.73x   1.71x   1.78x   1.76x

camellia-asm vs aes-asm (8kB block):
        128bit  256bit
ecb-enc 1.15x   1.22x
ecb-dec 1.16x   1.16x
cbc-enc 0.85x   0.90x
cbc-dec 1.20x   1.23x
ctr-enc 1.28x   1.30x
ctr-dec 1.27x   1.28x
lrw-enc 1.12x   1.16x
lrw-dec 1.08x   1.10x
xts-enc 1.11x   1.15x
xts-dec 1.14x   1.15x

Intel Core2 T8100 (fam:6, model:23, step:6):

camellia-asm vs camellia_generic:
128bit key:                                       (lrw:256bit)    (xts:256bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.10x   1.12x   1.14x   1.16x   1.16x   1.15x   1.02x   1.02x   1.08x   1.08x
64B     1.61x   1.60x   1.17x   1.68x   1.67x   1.66x   1.43x   1.42x   1.44x   1.42x
256B    1.65x   1.73x   1.17x   1.77x   1.81x   1.80x   1.54x   1.53x   1.58x   1.54x
1024B   1.76x   1.74x   1.18x   1.80x   1.85x   1.85x   1.60x   1.59x   1.65x   1.60x
8192B   1.77x   1.75x   1.19x   1.81x   1.85x   1.86x   1.63x   1.61x   1.66x   1.62x

256bit key:                                       (lrw:384bit)    (xts:512bit)
size    ecb-enc ecb-dec cbc-enc cbc-dec ctr-enc ctr-dec lrw-enc lrw-dec xts-enc xts-dec
16B     1.10x   1.07x   1.13x   1.16x   1.11x   1.16x   1.03x   1.02x   1.08x   1.07x
64B     1.61x   1.62x   1.15x   1.66x   1.63x   1.68x   1.47x   1.46x   1.47x   1.44x
256B    1.71x   1.70x   1.16x   1.75x   1.69x   1.79x   1.58x   1.57x   1.59x   1.55x
1024B   1.78x   1.72x   1.17x   1.75x   1.80x   1.80x   1.63x   1.62x   1.65x   1.62x
8192B   1.76x   1.73x   1.17x   1.78x   1.80x   1.81x   1.64x   1.62x   1.68x   1.64x

camellia-asm vs aes-asm (8kB block):
        128bit  256bit
ecb-enc 1.17x   1.21x
ecb-dec 1.17x   1.20x
cbc-enc 0.80x   0.82x
cbc-dec 1.22x   1.24x
ctr-enc 1.25x   1.26x
ctr-dec 1.25x   1.26x
lrw-enc 1.14x   1.18x
lrw-dec 1.13x   1.17x
xts-enc 1.14x   1.18x
xts-dec 1.14x   1.17x

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Fix checkpatch warnings before renaming file. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Rename the camellia module to camellia_generic to allow optimized assembler implementations to autoload via a module alias. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
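For reference, the autoloading relies on the kernel's module alias mechanism; a sketch of the relevant lines in the renamed module (reconstructed for illustration, not copied from the patch):

    #include <linux/module.h>

    /* In camellia_generic.c (sketch): the file is renamed, but requests
     * for the cipher still use the algorithm name, so the module keeps
     * answering to "camellia" through an alias; an optimized module can
     * carry the same alias and be loaded instead when present. */
    MODULE_ALIAS("camellia");
    MODULE_DESCRIPTION("Camellia Cipher Algorithm");
    MODULE_LICENSE("GPL");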
-
Jussi Kivilinna authored
Add tests for CTR, LRW and XTS modes. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
New ECB, CBC, CTR, LRW and XTS test vectors for camellia. The larger ECB/CBC test vectors are needed for the parallel 2-way camellia implementation. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
camellia_setup_tail() applies the 'inverse of the last half of the P-function' to the subkeys, which is unneeded if the keys are applied directly to yl/yr in CAMELLIA_ROUNDSM. The patch speeds up key setup and should speed up CAMELLIA_ROUNDSM, as applying the key to yl/yr early has fewer register dependencies.

Quick tcrypt camellia results:
 x86_64, AMD Phenom II:  ~5% faster
 x86_64, Intel Core 2:   ~0.5% faster
 i386, Intel Atom N270:  ~1% faster

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 25 Feb, 2012 6 commits
-
Jussi Kivilinna authored
x86 has fast unaligned accesses, so twofish-x86_64/i586 does not need to enforce alignment. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
x86 has fast unaligned accesses, so blowfish-x86_64 does not need to enforce alignment. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
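This applies to both of the preceding commits; for context, the alignment requirement lives in the algorithm descriptor. A sketch with illustrative values (not the real twofish/blowfish definitions): a cra_alignmask of 0 tells the crypto API not to bounce data through aligned scratch buffers, which x86 does not need because misaligned accesses are cheap there.

    #include <linux/crypto.h>
    #include <linux/module.h>

    /* Sketch of the relevant part of a cipher's crypto_alg definition. */
    static struct crypto_alg example_alg = {
            .cra_name               = "example-cipher",
            .cra_driver_name        = "example-cipher-asm",
            .cra_priority           = 300,
            .cra_flags              = CRYPTO_ALG_TYPE_CIPHER,
            .cra_blocksize          = 8,
            /* 0: no alignment enforced, so the API skips the re-alignment
             * copies (3 would force 4-byte alignment, 7 8-byte, etc.). */
            .cra_alignmask          = 0,
            .cra_module             = THIS_MODULE,
    };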
-
Jussi Kivilinna authored
The driver name in the ablk_*_init functions can be constructed at runtime. Therefore use a single function, ablk_init, to reduce object size. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
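A hedged sketch of the idea (the "__driver-" prefix, the ablk_ctx layout and the exact helper calls are assumptions, not taken from the patch): the per-mode init functions differed only in the driver-name string handed to cryptd, and that string can be derived from the tfm at run time.

    #include <crypto/cryptd.h>
    #include <linux/crypto.h>
    #include <linux/err.h>
    #include <linux/string.h>

    struct ablk_ctx {
            struct cryptd_ablkcipher *cryptd_tfm;   /* assumed per-tfm context */
    };

    static int ablk_init(struct crypto_tfm *tfm)
    {
            struct ablk_ctx *ctx = crypto_tfm_ctx(tfm);
            struct cryptd_ablkcipher *cryptd_tfm;
            char drv_name[CRYPTO_MAX_ALG_NAME];

            /* Build the cryptd driver name from this tfm's own driver name
             * instead of hard-coding one string per mode. */
            snprintf(drv_name, sizeof(drv_name), "__driver-%s",
                     crypto_tfm_alg_driver_name(tfm));

            cryptd_tfm = cryptd_alloc_ablkcipher(drv_name, 0, 0);
            if (IS_ERR(cryptd_tfm))
                    return PTR_ERR(cryptd_tfm);

            ctx->cryptd_tfm = cryptd_tfm;
            return 0;
    }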
-
Jussi Kivilinna authored
Combine all the crypto_algs to be registered and use the new crypto_[un]register_algs functions. This simplifies the init/exit code and reduces object size. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
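For context, the resulting pattern in a cipher module looks roughly like this (a sketch with placeholder algorithm definitions, not the actual tables from the patched modules):

    #include <linux/crypto.h>
    #include <linux/kernel.h>
    #include <linux/module.h>

    static struct crypto_alg my_algs[2] = {   /* placeholder definitions */
            { .cra_name = "example1", .cra_blocksize = 16, .cra_module = THIS_MODULE },
            { .cra_name = "example2", .cra_blocksize = 16, .cra_module = THIS_MODULE },
    };

    static int __init my_mod_init(void)
    {
            /* One call registers the whole array and unwinds on failure. */
            return crypto_register_algs(my_algs, ARRAY_SIZE(my_algs));
    }

    static void __exit my_mod_exit(void)
    {
            crypto_unregister_algs(my_algs, ARRAY_SIZE(my_algs));
    }

    module_init(my_mod_init);
    module_exit(my_mod_exit);
    MODULE_LICENSE("GPL");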
-
Jussi Kivilinna authored
Combine all the crypto_algs to be registered and use the new crypto_[un]register_algs functions. This simplifies the init/exit code and reduces object size. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Combine all the crypto_algs to be registered and use the new crypto_[un]register_algs functions. This simplifies the init/exit code and reduces object size. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 16 Feb, 2012 2 commits
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Herbert Xu authored
Merge crypto tree as it has cherry-picked the ror64 patch from cryptodev.
-
Alexey Dobriyan authored
Use a standard ror64() instead of a hand-written one. There is no standard ror64 yet, so create it. The difference is that the shift value is an "unsigned int" instead of uint64_t (for which there is no reason). gcc then emits native ROR instructions, which it currently does not do for the hand-written version. This should make the code faster. Patch survives the in-tree crypto test and a ping flood with hmac(sha512) on. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
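The helper itself is tiny; this is essentially its standard form (a 64-bit rotate right whose shift is an unsigned int, letting gcc recognize the idiom and emit a native rotate instruction):

    #include <linux/types.h>

    /* Rotate a 64-bit word right by shift bits (0 < shift < 64). */
    static inline u64 ror64(u64 word, unsigned int shift)
    {
            return (word >> shift) | (word << (64 - shift));
    }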
-
- 14 Feb, 2012 2 commits
-
Jesper Juhl authored
We cannot reach the line after 'return err'. Remove it. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jesper Juhl authored
We can never reach the line just after the 'return 0' statement. Remove it. Signed-off-by: Jesper Juhl <jj@chaosbits.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 05 Feb, 2012 2 commits
-
Jesper Juhl authored
We declare 'exact' without initializing it and then do:

  [...]
  if (strlen(p->cru_driver_name))
          exact = 1;

  if (priority && !exact)
          return -EINVAL;
  [...]

If the first 'if' is not true, then the second will test an uninitialized 'exact'. As far as I can tell, what we want is for 'exact' to be initialized to 0 (zero/false).

Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
Unfortunately, in reducing W from 80 to 16 we ended up unrolling the loop twice. As gcc has issues dealing with 64-bit ops on i386, this means that we end up using even more stack space (>1K). This patch solves the problem by moving LOAD_OP/BLEND_OP into the loop itself, thus avoiding the need to duplicate the loop body. While the stack usage still isn't great (>0.5K), it is at least in the same ballpark as the amount of stack used by our C sha1 implementation. Note that this patch basically reverts to the original code, so the diff looks bigger than it really is. Cc: stable@vger.kernel.org Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 26 Jan, 2012 3 commits
-
Herbert Xu authored
The previous patch used the modulus operator over a power of 2 unnecessarily, which may produce suboptimal binary code. This patch changes them to binary ANDs instead. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
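The transformation is mechanical; a sketch for the 16-entry circular schedule (the helper name here is made up):

    #include <linux/types.h>

    /* For non-negative i and a power-of-two size, i % 16 == i & 15; the
     * AND form can never end up compiled as a division. */
    static inline unsigned int w_index(unsigned int i)
    {
            return i & 15;          /* previously written as i % 16 */
    }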
-
Kim Phillips authored
drivers/crypto/caam/ctrl.c: In function 'caam_probe':
drivers/crypto/caam/ctrl.c:49:6: warning: unused variable 'd' [-Wunused-variable]

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Mark Brown authored
Hardware crypto engines frequently need to register a selection of different algorithms with the core. Simplify their code slightly, especially the error handling, by providing functions to register a number of algorithms in a single call. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
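Roughly, the new helper registers the array front to back and, on the first failure, unregisters everything it had already registered, which is exactly the unwind boilerplate each driver previously carried. A sketch of that behaviour (not necessarily the exact code that was merged):

    #include <linux/crypto.h>
    #include <linux/errno.h>

    int crypto_register_algs(struct crypto_alg *algs, int count)
    {
            int i, ret;

            for (i = 0; i < count; i++) {
                    ret = crypto_register_alg(&algs[i]);
                    if (ret)
                            goto err;
            }
            return 0;

    err:
            /* Unwind: drop everything registered before the failure. */
            for (--i; i >= 0; i--)
                    crypto_unregister_alg(&algs[i]);
            return ret;
    }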
-
- 15 Jan, 2012 3 commits
-
Alexey Dobriyan authored
Use a standard ror64() instead of a hand-written one. There is no standard ror64 yet, so create it. The difference is that the shift value is an "unsigned int" instead of uint64_t (for which there is no reason). gcc then emits native ROR instructions, which it currently does not do for the hand-written version. This should make the code faster. Patch survives the in-tree crypto test and a ping flood with hmac(sha512) on. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Alexey Dobriyan authored
For rounds 16--79, W[i] only depends on W[i - 2], W[i - 7], W[i - 15] and W[i - 16]. Consequently, keeping the whole W[80] array on the stack is unnecessary; only 16 values are really needed. Using W[16] instead of W[80] greatly reduces stack usage (~750 bytes to ~340 bytes on x86_64).

Line-by-line explanation:
* The BLEND_OP array is "circular" now, so all indexes have to be taken modulo 16. The round number is positive, so the remainder operation should be without surprises.
* The initial full message schedule is trimmed to the first 16 values, which come from the data block; the rest is calculated just before it is needed.
* The original loop body is an unrolled version of the new SHA512_0_15 and SHA512_16_79 macros; the unrolling was done to avoid explicit variable renaming. Otherwise it is the very same code after preprocessing. See the sha1_transform() code, which does the same trick.

Patch survives the in-tree crypto test and the original bug-report test (ping flood with hmac(sha512)).

See FIPS 180-2 for the SHA-512 definition:
http://csrc.nist.gov/publications/fips/fips180-2/fips180-2withchangenotice.pdf

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
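A compact C sketch of the circular-schedule idea (simplified: the real implementation uses the kernel's macros and the full SHA-512 round function, the helper names here are illustrative, and ror64() is assumed available as added by the patch above): only the last 16 schedule words are kept, and each new word overwrites the slot of the word 16 rounds back.

    #include <linux/bitops.h>       /* ror64() */
    #include <linux/types.h>

    /* sigma functions from FIPS 180-2, section 4.1.3 */
    static inline u64 s0(u64 x) { return ror64(x, 1) ^ ror64(x, 8) ^ (x >> 7); }
    static inline u64 s1(u64 x) { return ror64(x, 19) ^ ror64(x, 61) ^ (x >> 6); }

    /* W[i] for i >= 16 depends only on W[i-2], W[i-7], W[i-15] and W[i-16],
     * so a 16-entry buffer indexed modulo 16 is enough; the slot W[i & 15]
     * still holds W[i-16] when the new word is computed. */
    static u64 schedule_word(u64 W[16], unsigned int i)
    {
            W[i & 15] += s1(W[(i - 2) & 15]) + W[(i - 7) & 15] +
                         s0(W[(i - 15) & 15]);
            return W[i & 15];
    }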
-
Alexey Dobriyan authored
commit f9e2bca6 aka "crypto: sha512 - Move message schedule W[80] to static percpu area" created a global message schedule area. If sha512_update() is ever entered twice, the hash will be silently calculated incorrectly.

Probably the easiest way to notice incorrect hashes being calculated is to run 2 ping floods over AH with hmac(sha512):

#!/usr/sbin/setkey -f
flush;
spdflush;
add IP1 IP2 ah 25 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000025;
add IP2 IP1 ah 52 -A hmac-sha512 0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000052;
spdadd IP1 IP2 any -P out ipsec ah/transport//require;
spdadd IP2 IP1 any -P in ipsec ah/transport//require;

XfrmInStateProtoError will start ticking, with -EBADMSG being returned from ah_input(). This never happens with, say, hmac(sha1).

With the patch applied (on BOTH sides), XfrmInStateProtoError does not tick with multiple bidirectional ping flood streams, just as it does not tick with SHA-1.

After this patch sha512_transform() will start using ~750 bytes of stack on x86_64. This is OK for simple loads; for something heavier, stack reduction will be done separately.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 13 Jan, 2012 10 commits
-
Kim Phillips authored
sha224 and sha384 support extends the caam startup noise to 21 lines. Do the same as commit 5b859b6 "crypto: talitos - be less noisy on startup", but for caam, and display:

caam ffe300000.crypto: fsl,sec-v4.0 algorithms registered in /proc/crypto

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Hemant Agrawal authored
Signed-off-by: Hemant Agrawal <hemant@freescale.com> Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Julia Lawall authored
The function is called with locks held and thus should not use GFP_KERNEL. The semantic patch that makes this report is available in scripts/coccinelle/locks/call_kern.cocci. More information about semantic patching is available at http://coccinelle.lip6.fr/

Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
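The general rule being enforced: GFP_KERNEL may sleep, which is not allowed while a spinlock is held, so allocations in such paths have to use GFP_ATOMIC (or be moved outside the locked region). A generic sketch, not the driver's actual code:

    #include <linux/slab.h>
    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(ctx_lock);

    static void *alloc_under_lock(size_t len)
    {
            void *buf;

            spin_lock(&ctx_lock);
            /* GFP_KERNEL could sleep here with the lock held; GFP_ATOMIC
             * cannot, at the cost of a higher chance of failure. */
            buf = kmalloc(len, GFP_ATOMIC);
            spin_unlock(&ctx_lock);

            return buf;
    }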
-
Nikos Mavrogiannopoulos authored
The added CRYPTO_ALG_KERN_DRIVER_ONLY indicates whether a cipher is only available via a kernel driver. If the cipher implementation might be available by using an instruction set or by porting the kernel code, then it must not be set. Signed-off-by: Nikos Mavrogiannopoulos <nmav@gnutls.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
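A sketch of how a hardware-only driver would advertise this (the algorithm definition is a placeholder, not from any particular driver): the flag is just another bit in cra_flags, set only when the implementation cannot exist outside a kernel driver.

    #include <linux/crypto.h>
    #include <linux/module.h>

    static struct crypto_alg hw_only_alg = {        /* placeholder definition */
            .cra_name        = "cbc(aes)",
            .cra_driver_name = "cbc-aes-mydevice",  /* hypothetical name */
            .cra_priority    = 400,
            /* Only set KERN_DRIVER_ONLY when the cipher is reachable solely
             * through this hardware driver, never via an instruction set or
             * a port of the kernel code. */
            .cra_flags       = CRYPTO_ALG_TYPE_ABLKCIPHER | CRYPTO_ALG_ASYNC |
                               CRYPTO_ALG_KERN_DRIVER_ONLY,
            .cra_blocksize   = 16,
            .cra_module      = THIS_MODULE,
    };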
-
Julia Lawall authored
Reimplement a call to devm_request_mem_region followed by a call to ioremap or ioremap_nocache by a call to devm_request_and_ioremap.

The semantic patch that makes this transformation is as follows (http://coccinelle.lip6.fr/):

// <smpl>
@nm@
expression myname;
identifier i;
@@
struct platform_driver i = { .driver = { .name = myname } };

@@
expression dev,res,size;
expression nm.myname;
@@

-if (!devm_request_mem_region(dev, res->start, size,
-      \(res->name\|dev_name(dev)\|myname\))) {
-   ...
-   return ...;
-}
... when != res->start
(
-devm_ioremap(dev,res->start,size)
+devm_request_and_ioremap(dev,res)
|
-devm_ioremap_nocache(dev,res->start,size)
+devm_request_and_ioremap(dev,res)
)
... when any
    when != res->start
// </smpl>

Signed-off-by: Julia Lawall <julia@diku.dk>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
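In plain C the transformation looks roughly like this (a sketch with a made-up probe function; resource and error handling trimmed):

    #include <linux/device.h>
    #include <linux/errno.h>
    #include <linux/io.h>
    #include <linux/platform_device.h>

    static int my_probe(struct platform_device *pdev)   /* hypothetical driver */
    {
            struct resource *res;
            void __iomem *regs;

            res = platform_get_resource(pdev, IORESOURCE_MEM, 0);

            /* Before: two calls, each with its own error path.
             *
             *      if (!devm_request_mem_region(&pdev->dev, res->start,
             *                                   resource_size(res), pdev->name))
             *              return -EBUSY;
             *      regs = devm_ioremap(&pdev->dev, res->start,
             *                          resource_size(res));
             */

            /* After: one call claims the region and maps it. */
            regs = devm_request_and_ioremap(&pdev->dev, res);
            if (!regs)
                    return -EADDRNOTAVAIL;

            return 0;
    }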
-
Jussi Kivilinna authored
The matrix transpose macro in serpent-sse2 uses a mix of SSE2 integer and SSE floating-point instructions, which might cause a performance penalty on some CPUs. This patch replaces the transpose_4x4 macro with a version that uses only SSE2 integer instructions. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
The implementation in blowfish-x86_64 uses 64-bit rotations, which are slow on the Pentium 4, making blowfish-x86_64 slower than the generic C implementation there. Therefore blacklist the P4. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Jussi Kivilinna authored
Performance of twofish-x86_64-3way on the Intel Pentium 4 and Atom is lower than that of the twofish-x86_64 module, so blacklist these CPUs. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
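A hedged sketch of how such a blacklist check typically looks at module load time (the exact family/model tests and the force parameter are assumptions; the real modules may differ): consult boot_cpu_data and refuse to register when running on a known-slow CPU unless the user overrides it.

    #include <asm/processor.h>
    #include <linux/errno.h>
    #include <linux/module.h>

    static int force;       /* assumed module parameter to override the check */
    module_param(force, int, 0);
    MODULE_PARM_DESC(force, "Force module load, ignore CPU blacklist");

    static bool is_blacklisted_cpu(void)
    {
            if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL)
                    return false;

            /* Family 15 is the Pentium 4 / NetBurst line, where 64-bit
             * rotations and the 3-way parallel code run comparatively
             * slowly. */
            if (boot_cpu_data.x86 == 15)
                    return true;

            return false;
    }

    static int __init blacklist_demo_init(void)
    {
            if (is_blacklisted_cpu() && !force)
                    return -ENODEV;
            return 0;       /* the real code registers the cipher here */
    }
    module_init(blacklist_demo_init);
    MODULE_LICENSE("GPL");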
-
Varun Wadekar authored
The driver supports ecb/cbc/ofb/ansi_x9.31 rng modes and 128, 192 and 256-bit key sizes. Signed-off-by: Varun Wadekar <vwadekar@nvidia.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Henning Heinold authored
The crypto driver will need this API for its RNG calculations. In order to build the crypto driver as a module, tegra_chip_uid has to be exported. Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Henning Heinold <heinold@inf.fu-berlin.de> Signed-off-by: Varun Wadekar <vwadekar@nvidia.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 11 Jan, 2012 2 commits
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds authored
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (54 commits)
  crypto: gf128mul - remove leftover "(EXPERIMENTAL)" in Kconfig
  crypto: serpent-sse2 - remove unneeded LRW/XTS #ifdefs
  crypto: serpent-sse2 - select LRW and XTS
  crypto: twofish-x86_64-3way - remove unneeded LRW/XTS #ifdefs
  crypto: twofish-x86_64-3way - select LRW and XTS
  crypto: xts - remove dependency on EXPERIMENTAL
  crypto: lrw - remove dependency on EXPERIMENTAL
  crypto: picoxcell - fix boolean and / or confusion
  crypto: caam - remove DECO access initialization code
  crypto: caam - fix polarity of "propagate error" logic
  crypto: caam - more desc.h cleanups
  crypto: caam - desc.h - convert spaces to tabs
  crypto: talitos - convert talitos_error to struct device
  crypto: talitos - remove NO_IRQ references
  crypto: talitos - fix bad kfree
  crypto: convert drivers/crypto/* to use module_platform_driver()
  char: hw_random: convert drivers/char/hw_random/* to use module_platform_driver()
  crypto: serpent-sse2 - should select CRYPTO_CRYPTD
  crypto: serpent - rename serpent.c to serpent_generic.c
  crypto: serpent - cleanup checkpatch errors and warnings
  ...
-
git://selinuxproject.org/~jmorris/linux-security
Linus Torvalds authored
* 'for-linus' of git://selinuxproject.org/~jmorris/linux-security: (32 commits)
  ima: fix invalid memory reference
  ima: free duplicate measurement memory
  security: update security_file_mmap() docs
  selinux: Casting (void *) value returned by kmalloc is useless
  apparmor: fix module parameter handling
  Security: tomoyo: add .gitignore file
  tomoyo: add missing rcu_dereference()
  apparmor: add missing rcu_dereference()
  evm: prevent racing during tfm allocation
  evm: key must be set once during initialization
  mpi/mpi-mpow: NULL dereference on allocation failure
  digsig: build dependency fix
  KEYS: Give key types their own lockdep class for key->sem
  TPM: fix transmit_cmd error logic
  TPM: NSC and TIS drivers X86 dependency fix
  TPM: Export wait_for_stat for other vendor specific drivers
  TPM: Use vendor specific function for status probe
  tpm_tis: add delay after aborting command
  tpm_tis: Check return code from getting timeouts/durations
  tpm: Introduce function to poll for result of self test
  ...

Fix up trivial conflict in lib/Makefile due to addition of CONFIG_MPI and SIGSIG next to CONFIG_DQL addition.
-