- 19 Apr, 2024 8 commits
-
-
Adam Guerin authored
Improve error message to be more readable. Fixes: 5da6a2d5 ("crypto: qat - generate dynamically arbiter mappings") Signed-off-by: Adam Guerin <adam.guerin@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Instead of loading the message words into both MSG and \m0 and then adding the round constants to MSG, load the message words into \m0 and the round constants into MSG and then add \m0 to MSG. This shortens the source code slightly. It changes the instructions slightly, but it doesn't affect binary code size and doesn't seem to affect performance. Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
- Load the SHA-256 round constants relative to a pointer that points into the middle of the constants rather than to the beginning. Since x86 instructions use signed offsets, this decreases the instruction length required to access some of the later round constants. - Use punpcklqdq or punpckhqdq instead of longer instructions such as pshufd, pblendw, and palignr. This doesn't harm performance. The end result is that sha256_ni_transform shrinks from 839 bytes to 791 bytes, with no loss in performance. Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
MSGTMP[0-3] are used to hold the message schedule and are not temporary registers per se. MSGTMP4 is used as a temporary register for several different purposes and isn't really related to MSGTMP[0-3]. Rename them to MSG[0-3] and TMP accordingly. Also add a comment that clarifies what MSG is. Suggested-by: Stefan Kanthak <stefan.kanthak@nexgo.de> Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
To avoid source code duplication, do the SHA-256 rounds using macros. This reduces the length of sha256_ni_asm.S by 153 lines while still producing the exact same object file. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Damian Muszynski authored
The Intel QAT driver provides support for the Diffie-Hellman (DH) algorithm, limited to prime numbers up to 4K. This driver is used by default on platforms with integrated QAT hardware for all DH requests. This has led to failures with algorithms requiring larger prime sizes, such as ffdhe6144. alg: ffdhe6144(dh): test failed on vector 1, err=-22 alg: self-tests for ffdhe6144(qat-dh) (ffdhe6144(dh)) failed (rc=-22) Implement a fallback mechanism when an unsupported request is received. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Access the AES round keys using offsets -7*16 through 7*16, instead of 0*16 through 14*16. This allows VEX-encoded instructions to address all round keys using 1-byte offsets, whereas before some needed 4-byte offsets. This decreases the code size of aes-xts-avx-x86_64.o by 4.2%. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chen Ni authored
Add check for dma_map_single() and return error if it fails in order to avoid invalid dma address. Fixes: e9297111 ("crypto: octeontx2 - add ctx_val workaround") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Reviewed-by: Bharat Bhushan <bbhushan2@marvell.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 12 Apr, 2024 32 commits
-
-
Eric Biggers authored
Make the non-AVX implementation of AES-XTS (xts-aes-aesni) use the new glue code that was introduced for the AVX implementations of AES-XTS. This reduces code size, and it improves the performance of xts-aes-aesni due to the optimization for messages that don't span page boundaries. This required moving the new glue functions higher up in the file and allowing the IV encryption function to be specified by the caller. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Lukas Wunner authored
Add a DEFINE_FREE() clause for x509_certificate structs and use it in x509_cert_parse() and x509_key_preparse(). These are the only functions where scope-based x509_certificate allocation currently makes sense. A third user will be introduced with the forthcoming SPDM library (Security Protocol and Data Model) for PCI device authentication. Unlike most other DEFINE_FREE() clauses, this one checks for IS_ERR() instead of NULL before calling x509_free_certificate() at end of scope. That's because the "constructor" of x509_certificate structs, x509_cert_parse(), returns a valid pointer or an ERR_PTR(), but never NULL. Comparing the Assembler output before/after has shown they are identical, save for the fact that gcc-12 always generates two return paths when __cleanup() is used, one for the success case and one for the error case. In x509_cert_parse(), add a hint for the compiler that kzalloc() never returns an ERR_PTR(). Otherwise the compiler adds a gratuitous IS_ERR() check on return. Introduce an assume() macro for this which can be re-used elsewhere in the kernel to provide hints for the compiler. Suggested-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Link: https://lore.kernel.org/all/20231003153937.000034ca@Huawei.com/ Link: https://lwn.net/Articles/934679/Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
When the qm uninit command is executed, the err data needs to be released to prevent memory leakage. The error information release operation and uacce_remove are integrated in qm_remove_uacce. So add the qm_remove_uacce to qm uninit to avoid err memory leakage. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
When dumping SQ, only the corresponding ID's SQE needs to be dumped, and there is no need to apply for the entire SQE memory. This is because excessive dump operations can lead to memory resource waste. Therefor apply for the space corresponding to sqe_id separately to avoid space waste. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
The AIV is one of the SEC resources. When releasing resources, it need to release the AIV resources at the same time. Otherwise, memory leakage occurs. The aiv resource release is added to the sec resource release function. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
There is a scenario where the file directory is created but the file memory is not set. In this case, if a user accesses the file, an error occurs. So during the creation process of debugfs, memory should be allocated first before creating the directory. In the release process, the directory should be deleted first before releasing the memory to avoid the situation where the memory does not exist when accessing the directory. In addition, the directory released by the debugfs is a global variable. When the debugfs of an accelerator fails to be initialized, releasing the directory of the global variable affects the debugfs initialization of other accelerators. The debugfs root directory released by debugfs init should be a member of qm, not a global variable. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
The cmd type can be extended. Currently, only four types of cmd can be processed. Therefor, add the default processing branch to intercept incorrect parameter input. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
There is a scenario where the file directory is created but the file attribute is not set. In this case, if a user accesses the file, an error occurs. So adjust the processing logic in the debugfs creation to prevent the file from being accessed before the file attributes such as the index are set. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
The input parameter check in acc_get_sgl is redundant. The caller has been verified once. When the check is performed for multiple times, the performance deteriorates. So the redundant parameter verification is deleted, and the index verification is changed to the module entry function for verification. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
During the zip probe process, the debugfs failure does not stop the probe. When debugfs initialization fails, jumping to the error branch will also release regs, in addition to its own rollback operation. As a result, it may be released repeatedly during the regs uninit process. Therefore, the null check needs to be added to the regs uninit process. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Chenghai Huang authored
When CONFIG_PCI_IOV is disabled, the SRIOV configuration function is not required. An error occurs if this function is incorrectly called. Consistent with other modules, add the condition for configuring the sriov function of sec_pci_driver. Signed-off-by: Chenghai Huang <huangchenghai2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Since sha512_transform_rorx() uses ymm registers, execute vzeroupper before returning from it. This is necessary to avoid reducing the performance of SSE code. Fixes: e01d69cb ("crypto: sha512 - Optimized SHA512 x86_64 assembly routine using AVX instructions.") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Since sha256_transform_rorx() uses ymm registers, execute vzeroupper before returning from it. This is necessary to avoid reducing the performance of SSE code. Fixes: d34a4600 ("crypto: sha256 - Optimized sha256 x86_64 routine using AVX2's RORX instructions") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Eric Biggers authored
Since nh_avx2() uses ymm registers, execute vzeroupper before returning from it. This is necessary to avoid reducing the performance of SSE code. Fixes: 0f961f9f ("crypto: x86/nhpoly1305 - add AVX2 accelerated NHPoly1305") Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Tom Zanussi authored
If some cpus are offlined, or if the node mask is smaller than expected, the 'nonexistent cpu' warning in rebalance_wq_table() may be erroneously triggered. Use cpumask_weight() to make sure we only iterate over the exact number of cpus in the mask. Also use num_possible_cpus() instead of num_online_cpus() to make sure all slots in the wq table are initialized. Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
Enable the x509 parser to accept NIST P521 certificates and add the OID for ansip521r1, which is the identifier for NIST P521. Cc: David Howells <dhowells@redhat.com> Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
Adjust the calculation of the maximum signature size for support of NIST P521. While existing curves may prepend a 0 byte to their coordinates (to make the number positive), NIST P521 will not do this since only the first bit in the most significant byte is used. If the encoding of the x & y coordinates requires at least 128 bytes then an additional byte is needed for the encoding of the length. Take this into account when calculating the maximum signature size. Reviewed-by: Lukas Wunner <lukas@wunner.de> Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
Register NIST P521 as an akcipher and extend the testmgr with NIST P521-specific test vectors. Add a module alias so the module gets automatically loaded by the crypto subsystem when the curve is needed. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
In cases where 'keylen' was referring to the size of the buffer used by a curve's digits, it does not reflect the purpose of the variable anymore once NIST P521 is used. What it refers to then is the size of the buffer, which may be a few bytes larger than the size a coordinate of a key. Therefore, rename keylen to bufsize where appropriate. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
Replace the usage of ndigits with nbits where precise space calculations are needed, such as in ecdsa_max_size where the length of a coordinate is determined. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
Add the parameters for the NIST P521 curve and define a new curve ID for it. Make the curve available in ecc_get_curve. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
In ecc_point_mult use the number of bits of the NIST P521 curve + 2. The change is required specifically for NIST P521 to pass mathematical tests on the public key. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
Implement vli_mmod_fast_521 following the description for how to calculate the modulus for NIST P521 in the NIST publication "Recommendations for Discrete Logarithm-Based Cryptography: Elliptic Curve Domain Parameters" section G.1.4. NIST p521 requires 9 64bit digits, so increase the ECC_MAX_DIGITS so that the vli digit array provides enough elements to fit the larger integers required by this curve. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
Add the number of bits a curve has to the ecc_curve definition to be able to derive the number of bytes a curve requires for its coordinates from it. It also allows one to identify a curve by its particular size. Set the number of bits on all curve definitions. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Vitaly Chikunov <vt@altlinux.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
res.x has been calculated by ecc_point_mult_shamir, which uses 'mod curve_prime' on res.x and therefore p > res.x with 'p' being the curve_prime. Further, it is true that for the NIST curves p > n with 'n' being the 'curve_order' and therefore the following may be true as well: p > res.x >= n. If res.x >= n then res.x mod n can be calculated by iteratively sub- tracting n from res.x until res.x < n. For NIST P192/256/384 this can be done in a single subtraction. This can also be done in a single subtraction for NIST P521. The mathematical reason why a single subtraction is sufficient is due to the values of 'p' and 'n' of the NIST curves where the following holds true: note: max(res.x) = p - 1 max(res.x) - n < n p - 1 - n < n p - 1 < 2n => holds true for the NIST curves Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
In preparation for support of NIST P521, adjust the basic tests on the length of the provided key parameters to only ensure that the length of the x plus y coordinates parameter array is not an odd number and that each coordinate fits into an array of 'ndigits' digits. Mathematical tests on the key's parameters are then done in ecc_is_pubkey_valid_full rejecting invalid keys. The change is necessary since NIST P521 keys do not have keys with coordinates that each require 'full' digits (= all bits in u64 used). NIST P521 only requires 2 bytes (9 bits) in the most significant digit unlike NIST P192/256/384 that each require multiple 'full' digits. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
For NIST P192/256/384 the public key's x and y parameters could be copied directly from a given array since both parameters filled 'ndigits' of digits (a 'digit' is a u64). For support of NIST P521 the key parameters need to have leading zeros prepended to the most significant digit since only 2 bytes of the most significant digit are provided. Therefore, implement ecc_digits_from_bytes to convert a byte array into an array of digits and use this function in ecdsa_set_pub_key where an input byte array needs to be converted into digits. Suggested-by: Lukas Wunner <lukas@wunner.de> Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Stefan Berger authored
Replace hard-coded numbers with ECC_CURVE_NIST_P192/256/384_DIGITS where possible. Tested-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Akhil R authored
Add support for Tegra Security Engine which can accelerate various crypto algorithms. The Engine has two separate instances within for AES and HASH algorithms respectively. The driver registers two crypto engines - one for AES and another for HASH algorithms and these operate independently and both uses the host1x bus. Additionally, it provides hardware-assisted key protection for up to 15 symmetric keys which it can use for the cipher operations. Signed-off-by: Akhil R <akhilrajeev@nvidia.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Akhil R authored
Add Tegra Security Engine details to the SID table in host1x driver. These entries are required to be in place to configure the stream ID for SE. Register writes to stream ID registers fail otherwise. Signed-off-by: Akhil R <akhilrajeev@nvidia.com> Acked-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Akhil R authored
Add DT binding document for Tegra Security Engine. The AES and HASH algorithms are handled independently by separate engines within the Security Engine. These engines are registered as two separate crypto engine drivers. Signed-off-by: Akhil R <akhilrajeev@nvidia.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Herbert Xu authored
As the old padata code can execute in softirq context, disable softirqs for the new padata_do_mutithreaded code too as otherwise lockdep will get antsy. Reported-by: syzbot+0cb5bb0f4bf9e79db3b3@syzkaller.appspotmail.com Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-