- 24 Feb, 2024 3 commits
-
-
Mario Limonciello authored
Errors can potentially occur in the "processing" of PSP commands or commands can be processed successfully but still return an error code in the header. This second case was being discarded because PSP communication worked but the command returned an error code in the payload header. Capture both cases and return them to the caller as -EIO for the caller to investigate. The caller can detect the latter by looking at `req->header->status`. Reported-and-tested-by: Tim Van Patten <timvp@google.com> Fixes: 7ccc4f4e ("crypto: ccp - Add support for an interface for platform features") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Arnd Bergmann authored
clang-16 warns about casting between incompatible function types: arch/arm/crypto/sha256_glue.c:37:5: error: cast from 'void (*)(u32 *, const void *, unsigned int)' (aka 'void (*)(unsigned int *, const void *, unsigned int)') to 'sha256_block_fn *' (aka 'void (*)(struct sha256_state *, const unsigned char *, int)') converts to incompatible function type [-Werror,-Wcast-function-type-strict] 37 | (sha256_block_fn *)sha256_block_data_order); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ arch/arm/crypto/sha512-glue.c:34:3: error: cast from 'void (*)(u64 *, const u8 *, int)' (aka 'void (*)(unsigned long long *, const unsigned char *, int)') to 'sha512_block_fn *' (aka 'void (*)(struct sha512_state *, const unsigned char *, int)') converts to incompatible function type [-Werror,-Wcast-function-type-strict] 34 | (sha512_block_fn *)sha512_block_data_order); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix the prototypes for the assembler functions to match the typedef. The code already relies on the digest being the first part of the state structure, so there is no change in behavior. Fixes: c80ae7ca ("crypto: arm/sha512 - accelerated SHA-512 using ARM generic ASM and NEON") Fixes: b59e2ae3 ("crypto: arm/sha256 - move SHA-224/256 ASM/NEON implementation to base layer") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Giovanni Cabiddu authored
Remove unneeded colon in the auto_reset section. This resolves the following errors when building the documentation: Documentation/ABI/testing/sysfs-driver-qat:146: ERROR: Unexpected indentation. Documentation/ABI/testing/sysfs-driver-qat:146: WARNING: Block quote ends without a blank line; unexpected unindent. Fixes: f5419a42 ("crypto: qat - add auto reset on error") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-kernel/20240212144830.70495d07@canb.auug.org.au/T/Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 17 Feb, 2024 7 commits
-
-
Damian Muszynski authored
During the PCI AER system's error recovery process, the kernel driver may encounter a race condition with freeing the reset_data structure's memory. If the device restart will take more than 10 seconds the function scheduling that restart will exit due to a timeout, and the reset_data structure will be freed. However, this data structure is used for completion notification after the restart is completed, which leads to a UAF bug. This results in a KFENCE bug notice. BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat] Use-after-free read at 0x00000000bc56fddf (in kfence-#142): adf_device_reset_worker+0x38/0xa0 [intel_qat] process_one_work+0x173/0x340 To resolve this race condition, the memory associated to the container of the work_struct is freed on the worker if the timeout expired, otherwise on the function that schedules the worker. The timeout detection can be done by checking if the caller is still waiting for completion or not by using completion_done() function. Fixes: d8cba25d ("crypto: qat - Intel(R) QAT driver framework") Cc: <stable@vger.kernel.org> Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Damian Muszynski authored
The implementation of the Rate Limiting (RL) feature includes the cleanup of all SLAs during device shutdown. For each SLA, the firmware is notified of the removal through an admin message, the data structures that take into account the budgets are updated and the memory is freed. However, this explicit cleanup is not necessary as (1) the device is reset, and the firmware state is lost and (2) all RL data structures are freed anyway. In addition, if the device is unresponsive, for example after a PCI AER error is detected, the admin interface might not be available. This might slow down the shutdown sequence and cause a timeout in the recovery flows which in turn makes the driver believe that the device is not recoverable. Fix by replacing the explicit SLAs removal with just a free of the SLA data structures. Fixes: d9fb8408 ("crypto: qat - add rate limiting feature to qat_4xxx") Cc: <stable@vger.kernel.org> Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Lukas Bulwahn authored
Commit 10930333 ("crypto: vmx - Move to arch/powerpc/crypto") moves the crypto vmx files to arch/powerpc, but misses to adjust the file entries for IBM Power VMX Cryptographic instructions and LINUX FOR POWERPC. Hence, ./scripts/get_maintainer.pl --self-test=patterns complains about broken references. Adjust these file entries accordingly. To keep the matched files exact after the movement, spell out each file name in the new directory. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Weili Qian authored
The function qm_stop_qp_nolock() always return zero, so function type is changed to void. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Weili Qian authored
The debugfs files 'dev_state' and 'dev_timeout' are added. Users can query the current queue stop status through these two files. And set the waiting timeout when the queue is released. dev_state: if dev_timeout is set, dev_state indicates the status of stopping the queue. 0 indicates that the queue is stopped successfully. Other values indicate that the queue stops fail. If dev_timeout is not set, the value of dev_state is 0; dev_timeout: if the queue fails to stop, the queue is released after waiting dev_timeout * 20ms. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Weili Qian authored
Hardware V3 could be able to drain function by sending mailbox to hardware which will trigger tasks in device to be flushed out. When the function is reset, the function can be stopped by this way. Signed-off-by: Weili Qian <qianweili@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Borislav Petkov (AMD) authored
In the case when only TSME is enabled, it is useful to state that fact too, so that users are aware that memory encryption is still enabled even when the corresponding software variant of memory encryption is not enabled. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 09 Feb, 2024 12 commits
-
-
Joachim Vandersmissen authored
SP 800-56Br2, Section 7.1.1 [1] specifies that: 1. If m does not satisfy 1 < m < (n – 1), output an indication that m is out of range, and exit without further processing. Similarly, Section 7.1.2 of the same standard specifies that: 1. If the ciphertext c does not satisfy 1 < c < (n – 1), output an indication that the ciphertext is out of range, and exit without further processing. This range is slightly more conservative than RFC3447, as it also excludes RSA fixed points 0, 1, and n - 1. [1] https://doi.org/10.6028/NIST.SP.800-56Br2Signed-off-by: Joachim Vandersmissen <git@jvdsn.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Mun Chun Yep authored
Rework the AER reset and recovery flow to take into account root port integrated devices that gets reset between the error detected and the slot reset callbacks. In adf_error_detected() the devices is gracefully shut down. The worker threads are disabled, the error conditions are notified to listeners and through PFVF comms and finally the device is reset as part of adf_dev_down(). In adf_slot_reset(), the device is brought up again. If SRIOV VFs were enabled before reset, these are re-enabled and VFs are notified of restarting through PFVF comms. Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Furong Zhou authored
When the driver detects an heartbeat failure, it starts the recovery flow. Set a limit so that the number of events is limited in case the heartbeat status is read too frequently. Signed-off-by: Furong Zhou <furong.zhou@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Damian Muszynski authored
Expose the `auto_reset` sysfs attribute to configure the driver to reset the device when a fatal error is detected. When auto reset is enabled, the driver resets the device when it detects either an heartbeat failure or a fatal error through an interrupt. This patch is based on earlier work done by Shashank Gupta. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Mun Chun Yep authored
Notify a fatal error condition and optionally reset the device in the following cases: * if the device reports an uncorrectable fatal error through an interrupt * if the heartbeat feature detects that the device is not responding This patch is based on earlier work done by Shashank Gupta. Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Mun Chun Yep authored
When a Physical Function (PF) is reset, SR-IOV gets disabled, making the associated Virtual Functions (VFs) unavailable. Even after reset and using pci_restore_state, VFs remain uncreated because the numvfs still at 0. Therefore, it's necessary to reconfigure SR-IOV to re-enable VFs. This commit introduces the ADF_SRIOV_ENABLED configuration flag to cache the SR-IOV enablement state. SR-IOV is only re-enabled if it was previously configured. This commit also introduces a dedicated workqueue without `WQ_MEM_RECLAIM` flag for enabling SR-IOV during Heartbeat and CPM error resets, preventing workqueue flushing warning. This patch is based on earlier work done by Shashank Gupta. Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Mun Chun Yep authored
Update the PFVF logic to handle restart and recovery. This adds the following functions: * adf_pf2vf_notify_fatal_error(): allows the PF to notify VFs that the device detected a fatal error and requires a reset. This sends to VF the event `ADF_PF2VF_MSGTYPE_FATAL_ERROR`. * adf_pf2vf_wait_for_restarting_complete(): allows the PF to wait for `ADF_VF2PF_MSGTYPE_RESTARTING_COMPLETE` events from active VFs before proceeding with a reset. * adf_pf2vf_notify_restarted(): enables the PF to notify VFs with an `ADF_PF2VF_MSGTYPE_RESTARTED` event after recovery, indicating that the device is back to normal. This prompts VF drivers switch back to use the accelerator for workload processing. These changes improve the communication and synchronization between PF and VF drivers during system restart and recovery processes. Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Furong Zhou authored
Disable arbitration to avoid new requests to be processed before resetting a device. This is needed so that new requests are not fetched when an error is detected. Signed-off-by: Furong Zhou <furong.zhou@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Furong Zhou authored
Add error notify method to report a fatal error event to all the subsystems registered. In addition expose an API, adf_notify_fatal_error(), that allows to trigger a fatal error notification asynchronously in the context of a workqueue. This will be invoked when a fatal error is detected by the ISR or through Heartbeat. Signed-off-by: Furong Zhou <furong.zhou@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Damian Muszynski authored
Add a mechanism that allows to inject a heartbeat error for testing purposes. A new attribute `inject_error` is added to debugfs for each QAT device. Upon a write on this attribute, the driver will inject an error on the device which can then be detected by the heartbeat feature. Errors are breaking the device functionality thus they require a device reset in order to be recovered. This functionality is not compiled by default, to enable it CRYPTO_DEV_QAT_ERROR_INJECTION must be set. Signed-off-by: Damian Muszynski <damian.muszynski@intel.com> Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Reviewed-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com> Reviewed-by: Ahsan Atta <ahsan.atta@intel.com> Reviewed-by: Markas Rapoportas <markas.rapoportas@intel.com> Signed-off-by: Mun Chun Yep <mun.chun.yep@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Li RongQing authored
virtqueue_enable_cb() will call virtqueue_poll() which will check if queue is broken at beginning, so remove the virtqueue_is_broken() call Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Quanyang Wang authored
When calling crypto_finalize_request, BH should be disabled to avoid triggering the following calltrace: ------------[ cut here ]------------ WARNING: CPU: 2 PID: 74 at crypto/crypto_engine.c:58 crypto_finalize_request+0xa0/0x118 Modules linked in: cryptodev(O) CPU: 2 PID: 74 Comm: firmware:zynqmp Tainted: G O 6.8.0-rc1-yocto-standard #323 Hardware name: ZynqMP ZCU102 Rev1.0 (DT) pstate: 40000005 (nZcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : crypto_finalize_request+0xa0/0x118 lr : crypto_finalize_request+0x104/0x118 sp : ffffffc085353ce0 x29: ffffffc085353ce0 x28: 0000000000000000 x27: ffffff8808ea8688 x26: ffffffc081715038 x25: 0000000000000000 x24: ffffff880100db00 x23: ffffff880100da80 x22: 0000000000000000 x21: 0000000000000000 x20: ffffff8805b14000 x19: ffffff880100da80 x18: 0000000000010450 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000003 x13: 0000000000000000 x12: ffffff880100dad0 x11: 0000000000000000 x10: ffffffc0832dcd08 x9 : ffffffc0812416d8 x8 : 00000000000001f4 x7 : ffffffc0830d2830 x6 : 0000000000000001 x5 : ffffffc082091000 x4 : ffffffc082091658 x3 : 0000000000000000 x2 : ffffffc7f9653000 x1 : 0000000000000000 x0 : ffffff8802d20000 Call trace: crypto_finalize_request+0xa0/0x118 crypto_finalize_aead_request+0x18/0x30 zynqmp_handle_aes_req+0xcc/0x388 crypto_pump_work+0x168/0x2d8 kthread_worker_fn+0xfc/0x3a0 kthread+0x118/0x138 ret_from_fork+0x10/0x20 irq event stamp: 40 hardirqs last enabled at (39): [<ffffffc0812416f8>] _raw_spin_unlock_irqrestore+0x70/0xb0 hardirqs last disabled at (40): [<ffffffc08122d208>] el1_dbg+0x28/0x90 softirqs last enabled at (36): [<ffffffc080017dec>] kernel_neon_begin+0x8c/0xf0 softirqs last disabled at (34): [<ffffffc080017dc0>] kernel_neon_begin+0x60/0xf0 ---[ end trace 0000000000000000 ]--- Fixes: 4d96f7d4 ("crypto: xilinx - Add Xilinx AES driver") Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 02 Feb, 2024 5 commits
-
-
Eric Biggers authored
Since crypto_hash_alg_has_setkey() is only called from ahash.c itself, make it a static function. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Wenkai Lin authored
Unused parameter of static functions should be removed. Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Qi Tao authored
This patch fixes following cleanup issues: - The return value of the function is inconsistent with the actual return type. - After the pointer type is directly converted to the `__le64` type, the program may crash or produce unexpected results. Signed-off-by: Qi Tao <taoqi10@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Qi Tao authored
Nested macros are integrated into a single macro, making the code simpler. Signed-off-by: Qi Tao <taoqi10@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Qi Tao authored
As the sec DFX function is enhanced, some RAS registers are added to the original DFX registers to enhance the DFX positioning function. Signed-off-by: Qi Tao <taoqi10@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
- 26 Jan, 2024 13 commits
-
-
David Wronek authored
Document the compatible used for the inline crypto engine found on SC7180. Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: David Wronek <davidwronek@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Joachim Vandersmissen authored
Commit a93492ca ("crypto: ccree - remove data unit size support") removed support for the xts512 and xts4096 algorithms, but left them defined in testmgr.c. This patch removes those definitions. Signed-off-by: Joachim Vandersmissen <git@jvdsn.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Erick Archer authored
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. So, use the purpose specific kcalloc_node() function instead of the argument count * size in the kzalloc_node() function. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/162Signed-off-by: Erick Archer <erick.archer@gmx.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Erick Archer authored
As noted in the "Deprecated Interfaces, Language Features, Attributes, and Conventions" documentation [1], size calculations (especially multiplication) should not be performed in memory allocator (or similar) function arguments due to the risk of them overflowing. This could lead to values wrapping around and a smaller allocation being made than the caller was expecting. Using those allocations could lead to linear overflows of heap memory and other misbehaviors. So, use the purpose specific kcalloc() function instead of the argument size * count in the kzalloc() function. Link: https://www.kernel.org/doc/html/next/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/162Signed-off-by: Erick Archer <erick.archer@gmx.com> Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Wenkai Lin authored
Switch to raw_smp_processor_id() to prevent a number of warnings from kernel debugging. We do not care about preemption here, as the CPU number is only used as a poor mans load balancing or device selection. If preemption happens during an encrypt/decrypt operation a small performance hit will occur but everything will continue to work, so just ignore it. This commit is similar to e7a9b05c ("crypto: cavium - Fix smp_processor_id() warnings"). [ 7538.874350] BUG: using smp_processor_id() in preemptible [00000000] code: af_alg06/8438 [ 7538.874368] caller is debug_smp_processor_id+0x1c/0x28 [ 7538.874373] CPU: 50 PID: 8438 Comm: af_alg06 Kdump: loaded Not tainted 5.10.0.pc+ #18 [ 7538.874377] Call trace: [ 7538.874387] dump_backtrace+0x0/0x210 [ 7538.874389] show_stack+0x2c/0x38 [ 7538.874392] dump_stack+0x110/0x164 [ 7538.874394] check_preemption_disabled+0xf4/0x108 [ 7538.874396] debug_smp_processor_id+0x1c/0x28 [ 7538.874406] sec_create_qps+0x24/0xe8 [hisi_sec2] [ 7538.874408] sec_ctx_base_init+0x20/0x4d8 [hisi_sec2] [ 7538.874411] sec_aead_ctx_init+0x68/0x180 [hisi_sec2] [ 7538.874413] sec_aead_sha256_ctx_init+0x28/0x38 [hisi_sec2] [ 7538.874421] crypto_aead_init_tfm+0x54/0x68 [ 7538.874423] crypto_create_tfm_node+0x6c/0x110 [ 7538.874424] crypto_alloc_tfm_node+0x74/0x288 [ 7538.874426] crypto_alloc_aead+0x40/0x50 [ 7538.874431] aead_bind+0x50/0xd0 [ 7538.874433] alg_bind+0x94/0x148 [ 7538.874439] __sys_bind+0x98/0x118 [ 7538.874441] __arm64_sys_bind+0x28/0x38 [ 7538.874445] do_el0_svc+0x88/0x258 [ 7538.874447] el0_svc+0x1c/0x28 [ 7538.874449] el0_sync_handler+0x8c/0xb8 [ 7538.874452] el0_sync+0x148/0x180 Signed-off-by: Wenkai Lin <linwenkai6@hisilicon.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The C glue code already infers whether or not the current iteration is the final one, by comparing walk.nbytes with walk.total. This means we can easily inform the asm helpers of this as well, by conditionally passing a pointer to the original IV, which is used in the finalization of the MAC. This removes the need for a separate call into the asm code to perform the finalization. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The encryption and decryption code paths are mostly identical, except for a small difference where the plaintext input into the MAC is taken from either the input or the output block. We can factor this in quite easily using a vector bit select, and a few additional XORs, without the need for branches. This way, we can use the same tail handling logic on the encrypt and decrypt code paths, allowing further consolidation of the asm helpers in a subsequent patch. (In the main loop, adding just a handful of ALU instructions results in a noticeable performance hit [around 5% on Apple M2], so those routines are kept separate) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
The CCM code as originally written attempted to use as few NEON registers as possible, to avoid having to eagerly preserve/restore the entire NEON register file at every call to kernel_neon_begin/end. At that time, this API took a number of NEON registers as a parameter, and only preserved that many registers. Today, the NEON register file is restored lazily, and the old API is long gone. This means we can use as many NEON registers as we can make meaningful use of, which means in the AES case that we can keep all round keys in registers rather than reloading each of them for each AES block processed. On Cortex-A53, this results in a speedup of more than 50%. (From 4 cycles per byte to 2.6 cycles per byte) Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
CCM combines the counter (CTR) encryption mode with a MAC based on the same block cipher. This MAC construction is a bit clunky: it invokes the block cipher in a way that cannot be parallelized, resulting in poor CPU pipeline efficiency. The arm64 CCM code mitigates this by interleaving the encryption and MAC at the AES round level, resulting in a substantial speedup. But this approach does not apply to the additional authenticated data (AAD) which is not encrypted. This means the special asm routine dealing with the AAD is not any better than the MAC update routine used by the arm64 AES block encryption driver, so let's reuse that, and drop the special AES-CCM version. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Implement the CCM tail handling using a single sequence that uses permute vectors and overlapping loads and stores, rather than going over the tail byte by byte in a loop, and using scalar operations. This is more efficient, even though the measured speedup is only around 1-2% on the CPUs I have tried. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
In preparation for optimizing the CCM core asm code using permutation vectors and overlapping loads and stores, ensure that inputs shorter than the size of a AES block are passed via a buffer on the stack, in a way that positions the data at the end of a 16 byte buffer. This removes the need for the asm code to reason about a rare corner case where the tail of the data cannot be read/written using a single NEON load/store instruction. While at it, tweak the copyright header and authorship to bring it up to date. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
Now that kernel mode NEON no longer disables preemption, we no longer have to take care to disable and re-enable use of the NEON when calling into the skcipher walk API. So just keep it enabled until done. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-
Ard Biesheuvel authored
This reverts commit 57ead1bf, which updated the CCM code to only rely on walk.nbytes to check for failures returned from the skcipher walk API, mostly for the common good rather than to fix a particular problem in the code. This change introduces a problem of its own: the skcipher walk is started with the 'atomic' argument set to false, which means that the skcipher walk API is permitted to sleep. Subsequently, it invokes skcipher_walk_done() with preemption disabled on the final iteration of the loop. This appears to work by accident, but it is arguably a bad example, and providing a better example was the point of the original patch. Given that future changes to the CCM code will rely on the original behavior of entering the loop even for zero sized inputs, let's just revert this change entirely, and proceed from there. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
-