- 22 May, 2022 40 commits
-
-
Ohad Sharabi authored
When user requests to prefetch the MMU translations, the driver will not block the user until prefetch is done. Instead, the prefetch work will be delegated to a WQ which will do it in the background. This way, the prefetch may progress without blocking the user at all. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
Changing format of memory manager messages to make it more readable. In addition, reducing the priority of a warning on missing handle during put. This scenario is not an indication of a problem and may happen in a legal flow, when handle is put from multiple flows. For example, in timeout and completion. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
If copy_to_user failed in info ioctl, we always return -EFAULT so the user will know there was an error. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
eventfd is pointer. As such, it should be initialized to NULL, not to 0. In addition, no need to initialize it after creation because the entire structure is zeroed-out. Also, no need to initialize it before release because the entire structure is freed. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
Update cpucp_if.h to latest version. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tal Cohen authored
The driver will be able to send notification events towards a user process, using user's registered event file descriptor. The driver uses the notification mechanism to inform the user about an occurred event. A user thread can wait until a notification is received from the driver. The driver stores the occurred event until the user reads it, using HL_INFO_GET_EVENTS - new ioctl opcode in the INFO ioctl. Gaudi specific implementation includes sending a notification on a TPC assertion event that is received from f/w. Signed-off-by: Tal Cohen <talcohen@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
Currently, buffers from multiple flows pass through the same infra. This way, in logs, we are unable to distinguish between buffers that came from separate flows. To address this problem, add a "topic" to buffer behavior descriptor - a string identifier that will be used to identify in logs the flow this buffer relates to. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dani Liberman authored
Scenario: 1. During hard reset, driver executes device_kill_open_processes. 2. Drivers file descriptor is not closed yet (user process is alive), hence we are starting loop on all open file descriptors. 3. Just before getting task struct of user process, according to pid, SIGKILL is sent to the user process, hence get_pid_task fails, driver prints a warning and device_kill_open_processes returns an error. 4. Returned error causing driver fini do disable the device object of the process which causes a kernel crash. The fix is to handle this case not as an error and continue fini flow as normal, since the killed process (by the SIGKILL) will release its resources just like it will do when the driver sends him the sigkill. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
Add the ability to scrub the device memory with a given value. Add file 'dram_mem_scrub_val' to set the value and a file 'dram_mem_scrub' to scrub the dram. This is very important to help during automated tests, when you want the CI system to randomize the memory before training certain DL topologies. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
With the new code required for the flow added, we can now switch to using the new memory manager infrastructure, removing the old code. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
This commit adds the new code needed for command buffer flow using the new unified memory manager, without changing the actual functionality. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ofir Bitton authored
In certain workloads, arbitration timeout might expire although no actual issue present. Hence, we set timeout to a very high value. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
Putting object by its handle and not by object pointer is useful in some finalization flows that do not have object pointer available. It eliminates the need to first get the object and then perform put twice. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
The new unified memory manager uses page offset to pass buffer handle during the mmap operation. One problem with this approach is that it requires the handle to always be divisible by the page size, else, the user would not be able to pass it correctly as an argument to the mmap system call. Previously, this was achieved by shifting the handle left after alloc operation, and shifting it right before get operation. This was done in the user code. This creates code duplication, and, what's worse, requires some knowledge from the user regarding the handle internal structure, hurting the encapsulation. This patch encloses all the page shifts inside memory manager functions. This way, the user can take the handle as a black box, and simply use it, without any concert about how it actually works. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
farah kassabri authored
Currently we're using the same poll interval value for both COMMs protocol(for sending a command and waits for an ACK) and the device CPU boot phases status waits. On COMMs protocol this interval should be much lower than the device CPU boot which may take long time to change status. Signed-off-by: farah kassabri <fkassabri@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dani Liberman authored
find_get_pid() isn't good in case the user process was run inside docker. As a result, we didn't had the PID and we couldn't kill the user process in case the device got stuck and we needed to reset the device. Signed-off-by: Dani Liberman <dliberman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
This patch let the user decide whether the translations done in the page tables will be fetched directly to the STLB right after the map. We want to let the user control whether to perform prefetch upon map operation. To do so a memory flag was added, to be used in the MAP ioctl, called HL_MEM_PREFETCH and if set- the mappings will be fetched directly to the STLB after map operation. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Robin Murphy authored
Even if an IOMMU might be present for some PCI segment in the system, that doesn't necessarily mean it provides translation for the device we care about. Replace iommu_present() with a more appropriate check. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Moti Haimovski authored
The habanalabs HW requires memory resources to be used by its internal hardware structures. These structures are allocated and initialized by the driver. We would like to use the device HBM for that purpose. This memory is io-remapped and accessed using the writel()/writeb()/writew() commands. Since some of the HW structures are one byte in size we need to add support for the writeb() and readb() functions in the driver. Signed-off-by: Moti Haimovski <mhaimovski@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
Instead of using for_each_sg when iterating sgt that contains dma entries, use the more proper for_each_sgtable_dma_sg macro. In addition, both Goya and Gaudi have the exact same implementation of the asic function that encapsulate the usage of this macro, so it is better to move that implementation to the common code. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rajaravi Krishna Katta authored
Use standard kernel macro to take lower 32 bits of 64-bits variable. Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
Take advantage of the HOPs shift/masks now defined as arrays. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rajaravi Krishna Katta authored
Incorrect/Missing doxygen tag Signed-off-by: Rajaravi Krishna Katta <rkatta@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
As user interrupts are a common use case, this dump pollutes the dmesg log, hence removing it. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
Only a hard-reset is an unexpected event which should be notify in the kernel log. Other resets are normal operations and therefore we should not pollute the log with them. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
Currently we have two reset prints per reset. One is in the common code and one in each asic-specific file. We can change the asic-specific message to be debug only as we can know the type of reset being done according to the print in the common code, which is also easier to maintain. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Oded Gabbay authored
Halting compute engines is a print that doesn't add us any information because it is always done in the reset process and not used elsewhere. Even if it was, we don't use prints to mark functions we passed through. Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
During the unified memory manager release, a wrong id was used to remove an entry from the idr. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
The debugfs memory access now uses the callback 'access_dev_mem' so there is no use of the callbacks 'debugfs_{read32,read64,write32,write6}'. Remove them. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
When accessing the configuration registers through debugfs, it is only allowed to access aligned address. Fail if address is not aligned. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
Currently each asic version implements 4 callbacks: 'debugfs_{read32/write32/read64/write64}' There is a lot of code duplication among the different callbacks of all asic versions. This patch unify the code in order to avoid the code duplication by iterating the pci_mem_region array in hl_device and use its fields instead of macros. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
This is a preparation for unifying the code of accessing device memory through debugfs. Add struct fields and callbacks that will later be used in debugfs code and will reduce code duplication among the different read{32,64}/write{32,64} callbacks of every asic. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
kernel test robot authored
drivers/misc/habanalabs/common/memory.c:2137:28: warning: symbol 'hl_ts_behavior' was not declared. Should it be static? Fixes: 4d530e7d ("habanalabs: convert ts to use unified memory manager") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
When Gaudi device is secured the monitors data in the configuration space is blocked from PCI access. As we need to enable user to get sync-manager monitors registers when debugging, this patch adds a debugfs that dumps the information to a binary file (blob). When a root user will trigger the dump, the driver will send request to the f/w to fill a data structure containing dump of all monitors registers. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
The out of memory message is rephrased to more subtle expression as out of memory may be caused by the user in case of, for example, greedy allocation. In addition the user is also being notified by an error code. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dafna Hirschfeld authored
We currently allow accessing the whole SRAM bar size with the macro SRAM_BAR_SIZE, but the actual size of the sram region is the macro SRAM_SIZE which is only a portion of the whole bar size. So when accessing the sram through debugfs, use the macro SRAM_SIZE for the sram size which is the correct macro. Signed-off-by: Dafna Hirschfeld <dhirschfeld@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ohad Sharabi authored
This is necessary pre-requisite for future ASIC support, where MMU TLB prefetch is supported. Signed-off-by: Ohad Sharabi <osharabi@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
With the introduction of the unified memory manager infrastructure, the timestamp buffers can be converted to use it. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Yuri Nudelman authored
This is a part of overall refactoring attempt to separate nic and the core drivers. Currently, there are 4 different flows, that contain very similar code. These are the ts, nic, hwblocks and cb alloc/map flows. The similar aspect of all these flows is that they all contain a central store, with memory buffers inside, supporting the following set of operations: - Allocate buffer and return handle - Get buffer from the store with handle - Put the buffer (last put releases the buffer) - Map the buffer to the user This patch contains a generic data structure used to implement the above memory buffer store interface. Conversion of the existing code to use the new data structure will follow. Signed-off-by: Yuri Nudelman <ynudelman@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ofir Bitton authored
We need this property for doing backward compatibility hacks against the f/w. Signed-off-by: Ofir Bitton <obitton@habana.ai> Reviewed-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Oded Gabbay <ogabbay@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-