- 01 Sep, 2021 17 commits
-
-
Yuri Nudelman authored
It is useful to have the ability to see which user address was pinned to which physical address during the initial mapping. We already have all that info stored, but no means to search this data (which may be quite large). Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
In case F/W security is enabled driver cannot access ECC registers, hence driver must fetch the ECC info from F/W. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
During the integration, the multi-CS requirements were refined: - The multi CS call shall wait on "per-ASIC" predefined stream masters instead of set of streams. - Stream masters are set of QIDs used by the upper SW layers (synapse) for completion (must be an external/HW queue). Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Alon Mizrahi authored
Current "state dump" is lacking of monitored SOB IDs. Add for convenience. Signed-off-by: Alon Mizrahi <amizrahi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Because we don't have multiple contexts in GAUDI, and to minimize calls to is_idle function (which uses many register reads), move the call to clear the user registers to the opening of the single user context. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Various f/w versions have different timeouts, so increase the default timeout to accommodate all the options. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Add several new packets between driver and firmware. Add matching compatibility bits for backward compatibility. Add support for 4K event types. Add information about pcie errors. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Because the register reads might be trapped by the hypervisor in certain deployments, minimize the number of reads during runtime by moving static initializations to functions that occur during device initialization instead of context open. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Yuri Nudelman authored
The address resolution via debugfs was not taking into consideration the page offset, resulting in a wrong address. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Yuri Nudelman authored
Currently userptr endpoint in debugfs prints out virtual addresses in the user process memory space, without specifying their owner process ID. User space virtual address is meaningless without knowing the owner process. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
HW init is mostly about configuring registers. Therefore, it is better to activate DMAs only in late init and afterwards. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
In order to enhance debuggability, we will scrub the whole HBM to a specific value, in case HBM scrubbing is enabled. Scrubbing will be performed after reset and after user closes the FD. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
Currently there is no validity check for event ID received from F/W, Thus exposing driver to memory overrun. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Koby Elbaz authored
For some ASICs, the f/w reads the msg_to_cpu_reg value after reset, and for some it doesn't. Therefore, to be sure f/w doesn't read a wrong value after reset, we need to clear this register before the reset occurs. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
In order to better support variants of the same ASIC the set_pci_regions function is now an ASIC function which allows each ASIC to implement it internally, thus keeping all definitions static to the file. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
Done as the bar size can exceed 4GB. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Add the server type property to the hl_info_hw_ip_info structure that is exposed to the user via the INFO IOCTL. This is needed by the userspace s/w stack to know the connections map of the internal links that connect the ASIC among themselves inside the server. The F/W will tell us, as part of the NIC information, the server type that the GAUDI is located in. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
- 29 Aug, 2021 23 commits
-
-
Oded Gabbay authored
This warning is redundant as we will print a notice in case the device is still in use after the FD was closed. No need to print the same message per context. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
farah kassabri authored
This commit is the second part of the encapsulated signals feature. It contains the driver support for submission of cs with encapsulated signals and the wait for them. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
farah kassabri authored
The signaling from within encapsulated OP capability is merged into the existing stream architecture, such that one can trigger multiple signaling from an encapsulated op, according to the time the event was done in the graph execution and avoid the need to wait for the whole encapsulated OP execution to be complete before the stream can signal. This commit implements only the reserve/unreserve part. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
farah kassabri authored
Currently the SOB reset was in fence release function which happens only at the CS wraparound during the CS allocation time. In order to support the new encapsulated signals reservation feature, we need to move the SOB reset to an earlier phase because this SOB could reach it's max value very fast using the signal reservation. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
When user sends multiple CSs, waiting for each CS is not efficient as it involves many user-kernel context switches. In order to address this issue we add support to "wait on multiple CSs" using a new uAPI which can wait on maximum of 32 CSs. The new uAPI is defined using a new flag - WAIT_FOR_MULTI_CS - in the wait_for_cs IOCTL. The input parameters for this uAPI will be: @seq: user pointer to an array of up to 32 CS's sequence numbers. @seq_array_len: length of sequence array. @timeout_us: timeout for waiting for any CS. The output paramateres for this API will be: @status: multi CS ioctl completion status (dedicated status was added as well). @flags: bitmap of output flags of the CS. @cs_completion_map: bitmap for multi CS, if CS sequence that was placed in index N in input seq array has completed- the N-th bit in cs_completion_map will be 1, otherwise it will be 0. @timestamp_nsec: timestamp of the first completed CS Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ohad Sharabi authored
To add proper support for wait-for-multi-CS, locking the CS lock for each CS fence in the list is not efficient. Instead, this patch add support to lock the CS lock once to get all required fences. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
The driver quietly handles memory mappings that were not freed so no need to print a warning about that when user closes the FD. Accordingly, revise the text that is printed in case the device is still in use after the user process closed the FD. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Need to initialize f/w Linux loaded indication to false to prevent wrong communication with the f/w. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Add two new fields regarding interrupts communication between driver and f/w. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Koby Elbaz authored
There is a scenario where an ongoing soft reset would race with an ongoing heartbeat routine, eventually causing heartbeat to fail and thus to escalate into a hard reset. With this fix, soft-reset procedure will disable heartbeat CPU messages and flush the (ongoing) current one before continuing with reset code. Signed-off-by: Koby Elbaz <kelbaz@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Print the SM name instead of index because it is more informational for the user to know the SM name instead of id when a SM interrupt occurs. In addition, the index that is printed is of the SOB group, not a specific SOB. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
State dump is relevant to the user in case of Sync Manager error, so we need to trigger it in that case as well. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
This is required from any device that is capable to perform DMA. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Each ASIC can have a different offset to add to a host dma address, to enable the ASIC to access that host memory. The usage for this can be common code so add this to the asic property structure. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
Recently, the size parameter in userptr structure was change to u64. As a result, we need to change the type of the local range_size in device_va_to_pa() to u64 to avoid overflow. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Tomer Tayar authored
If hard reset fails after the call to hw_fini and before loading the linux image to the device, a subsequent call to hw_fini should communicate via COMMS (or MSG_TO_CPU regs for old FW versions). However, the driver still tries in this case to communicate via the GIC, and thus no hard reset is actually done. To avoid that, the patch clears the linux_loaded flag after every call to hw_fini. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Tomer Tayar authored
In case of host-resident MMU, when the page tables pool is destroyed, its pointer is not nullified correctly. As a result, on a device fini which happens after a failing reset, the already destroyed pool is accessed, which leads to a kernel panic. The patch fixes the setting of the pool pointer to NULL. Signed-off-by: Tomer Tayar <ttayar@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Zvika Yehudai authored
This function will be used for more mmap operations than just mmaping CBs. Signed-off-by: Zvika Yehudai <zyehudai@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Ofir Bitton authored
missing mutex unlock once driver is giving up killing user processes. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Yuri Nudelman authored
At the first stage, only gaudi core dump shall be implemented, not including the status registers. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Yuri Nudelman authored
With the infrastructure in place, monitors and fences dump shall be implemented. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Yuri Nudelman authored
To improve the user's ability to debug the case where a workload that is part of executing training/inference of a topology is getting stuck, we need to add a 'core dump' each time a CS times-out. The 'core dump' shall contain all relevant Sync Manager information and corresponding fence values. The most recent dumps shall be accessible via debugfs, under 'state_dump' node. Reading from the node will provide the oldest dump available. Writing an integer value X will discard X dumps, starting with the oldest one, i.e. subsequent read will now return newer dumps. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-
Oded Gabbay authored
The previous function we used, find_get_pid(), wasn't good in case the user process was run inside docker. As a result, we didn't had the PID and we couldn't kill the user process in case the device got stuck and we needed to reset the device. Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
-