- 12 Jul, 2022 40 commits
-
-
Ohad Sharabi authored
Currently we are not waiting for preboot ready after hard reset. This leads to a race in which COMMs protocol begins but will get no response from the f/w. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
Correctable ECC events are not fatal, but as they accumulate, the f/w can decide that a hard-rest is required. This indication is propagated to the host using the existing ECC event interface. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Enable the Gaudi2 ASIC code in the pci probe callback of the driver so the driver will handle Gaudi2 ASICs. Add the PCI ID to the PCI table and add the ASIC enum value to all relevant places. Fixup the device parameters initialization for Gaudi2. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Moti Haimovski authored
Gaudi2 has new MMU units. A PMMU for device->host accesses, and HMMU for HBM accesses. The page tables of both MMUs are located in the host's memory (referred to in the code as host-resident pgt). Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
In Gaudi2 we moved to a different wait for command submission completion model. Instead of receiving interrupt only on external queues, we use the device's sync manager to notify us when the entire command submission finishes. This enables us to remove the categorization of queues to external and internal, and treat each queue equally, without the need to parse and patch any command buffer. This change also requires refactoring to the IRQ handling of CS completions. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Benjamin Dotan authored
Add the Gaudi2 code to initialize the ASIC's profiler. The profile receives its initialization values from the user, same as in Gaudi2, but the code to initialize is in the driver because the configuration space of the device is not directly exposed to the user. Signed-off-by: Benjamin Dotan <bdotan@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
Use the generic security module to block all registers in the ASIC and then open only those that are needed to be accessed by the user. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
As the ASICs become more complex and have many more registers, we need a better way to configure the security properties. As a reminder, we have two dedicated mechanisms for security: Range Registers and Protection bits. Those mechanisms protect sensitive memory and configuration areas inside the device. The generic module handles the low-level part of the configuration, because the configuration mechanism is identical in all ASICs. The difference is the address ranges and register names. Any ASIC that use this block should first block all the register blocks in the ASIC. Then, it should open only the registers that need to be accessed by the user (This is opposed to Goya and Gaudi, where we blocked only what should not be accesses by the user). The module contains several functions, to unblock single register, multiple registers, entire blocks, ranges, ranges with mask. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
There are a couple of device variables that are used for testing purposes and they are set to fixed values. Remove the variables that are not relevant anymore and document the remaining variables. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
New asic properties were added for Gaudi2. We want to initialize and use them, when relevant, also for Goya and Gaudi. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
There are a number of new ASIC-specific functions that were added for Gaudi2. To make the common code work, we need to define empty implementations of those functions for Goya and Gaudi. Some functions will return error if called with Goya/Gaudi. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Add the ASIC-specific code for Gaudi2. Supply (almost) all of the function callbacks that the driver's common code need to initialize, finalize and submit workloads to the Gaudi2 ASIC. It also contains the code to initialize the F/W of the Gaudi2 ASIC and to receive events from the F/W. It contains new debugfs entry to dump razwi events. razwi is a case where the device's engines create a transaction that reaches an invalid destination. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Add the new defines for GAUDI2 uapi interface. It includes the following: 1. Enums of engines and PLLs. 2. New information in the info IOCTL that is retrieved by the driver. 3. Update comments regarding the CB/CS/wait for CS ioctls. 4. New fields in the debug IOCTL for configuring the profiler for Gaudi2. There is no new IOCTL. Some of the changes are also relevant for Greco (which will be upstreamed later this year). When ever it says "Greco and onwards", it means it is also for Gaudi2. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Add the relevant GAUDI2 ASIC registers header files. These files are generated automatically from a tool maintained by the VLSI engineers. There are more files which are not upstreamed because only very few defines from those files are used in the driver. For those files, I copied the relevant defines into gaudi2_regs.h and gaudi2_masks.h, to reduce the size of this patch. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
Region structure is derived from region type, hence no need to pass it as an argument. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
PCI bar size is resource_size_t so we should use %pa to make it work correctly on all architectures. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dafna Hirschfeld authored
in gaudi_scrub_device_mem, replace call to hl_poll_timeout with a while loop to avoid using dummy variables. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
Because in future ASICs the driver will allow the user to set the page size we need to make sure this data is propagated in all APIs. In addition, since this is already an ASIC property we no longer need ASIC function for it. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Tomer Tayar authored
free_device_memory() ends with if and else, each has a return statement, followed by another return statement that can never be reached. Restructure the function and remove this dead code. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
We want to receive an error interrupt in case the watchdog timer expires on arbitration event in the queues. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
We dropped support for page sizes that are not power of 2. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
This is a pre-requisite patch for adding tracepoints to the DMA memory operations (allocation/free) in the driver. The main purpose is to be able to cross data with the map operations and determine whether memory violation occurred, for example free DMA allocation before unmapping it from device memory. To achieve this the DMA alloc/free code flows were refactored so that a single DMA tracepoint will catch many flows. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Also beautify code by preferring single line wherever possible. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
This fixes a sparse warning of "cast truncates bits from constant value" Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
packets are defined as LE so we need to convert before assigning values to them. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
function name in comment didn't match actual function name. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
The values in this enum are not used by h/w but are a contract between userspace and the kernel driver so they must be defined in the uapi file. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dafna Hirschfeld authored
Set a default value for memory scrubbing Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dafna Hirschfeld authored
In future ASICs, it would be possible to have a non-idle device when context is released. We thus need to postpone the scrubbing. Postpone it to hpriv release if reset is not executed or to device late init if reset is executed. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dafna Hirschfeld authored
In the callback scrub_device_mem, use 'memory_scrub_val' from debugfs for the scrubbing value. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dafna Hirschfeld authored
We use scrub_device_mem only to scrub the entire SRAM and entire DRAM. Therefore there is no need to send addr and size args to the callback. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dafna Hirschfeld authored
There is no need to do memory scrub when unmapping anymore as it is an overhead as long as we have a single user at any given time. Remove that code and change return value of free_phys_pg_pack to void Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
For easier debug, it is desirable to have a simple way to know whether the device is secured or not, hence we dump this indication during boot. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Yuri Nudelman authored
There is a rare race condition in CB completion mechanism, that can occur under a very high pressure of command submissions. The preconditions for this to happen are: 1. There should be enough command submissions for the pre-allocated patched CB pool to run out of commands. At this stage we start allocating new patched CBs as they arrive. 2. CB size has to be exactly (128*n + 104)B for some n, i.e. 24B below a cache line end. The flow: 1. Two command buffers being completed on different streams, at the same time. Denote those CB1 and CB2. 2. Each command buffer is injected with two messages, 16B each - one for a HBW update of the completion queue, another to raise interrupt. 3. Assume CB1 updated the completion queue and raise the interrupt. 4. Assume CB2 updated the completion queue but did not raise the interrupt yet. 5. The host receives the interrupt. It goes over the completion queue and sees two completions - CB1 and CB2. Release them both. 6. CB2 performs the last command. The problem is that the last command is split between 2 cache lines. So to read the last 8B of the last command, it has to access the host again. Problem is - CB2 is already released. This causes a DMAR error. The solution to this problem is simply to make sure the last two commands in the CB are always in the same cache line, using NOP padding. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Koby Elbaz authored
kernel test robot: "warning: variable 'index' is used uninitialized whenever 'if' condition is false" Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Dafna Hirschfeld authored
move the field memory_scrub_val from struct hl_dbg_device_entry to struct hl_device. This is because we want to use this field also if debugfs is off. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
function name should not be preceded with @ Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
kvcalloc is same as kvmalloc_array with GFP_ZERO. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Use %p instead of %llx for printing pointers. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
fence pointer can be NULL in this path, as shown by an earlier check. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-