- 21 Jul, 2020 8 commits
-
-
Christian König authored
Only functional change is to always keep io_reserved_count up to date for debugging even when it is not used otherwise. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/378242/
-
Christian König authored
Just use the use_io_reserve_lru flag. It doesn't make much sense to have two flags. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/378238/
-
Christian König authored
Nouveau is the only user of this functionality and evicting io space on -EAGAIN is really a misuse of the return code. Instead switch to using -ENOSPC here which makes much more sense and simplifies the code. This could unbreak something as we now cleanly return EAGAIN, but the chance for this are rather low. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/378237/
-
Christian König authored
Implementing those is completely unnecessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com> Link: https://patchwork.freedesktop.org/patch/378236/
-
Daniel Vetter authored
Comes up every few years, gets somewhat tedious to discuss, let's write this down once and for all. What I'm not sure about is whether the text should be more explicit in flat out mandating the amdkfd eviction fences for long running compute workloads or workloads where userspace fencing is allowed. v2: Now with dot graph! v3: Typo (Dave Airlie) Reviewed-by: Thomas Hellstrom <thomas.hellstrom@intel.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Stone <daniels@collabora.com> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Jesse Natalie <jenatali@microsoft.com> Cc: Steve Pronovost <spronovo@microsoft.com> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200709123339.547390-1-daniel.vetter@ffwll.ch
-
Daniel Vetter authored
Two in one go: - it is allowed to call dma_fence_wait() while holding a dma_resv_lock(). This is fundamental to how eviction works with ttm, so required. - it is allowed to call dma_fence_wait() from memory reclaim contexts, specifically from shrinker callbacks (which i915 does), and from mmu notifier callbacks (which amdgpu does, and which i915 sometimes also does, and probably always should, but that's kinda a debate). Also for stuff like HMM we really need to be able to do this, or things get real dicey. Consequence is that any critical path necessary to get to a dma_fence_signal for a fence must never a) call dma_resv_lock nor b) allocate memory with GFP_KERNEL. Also by implication of dma_resv_lock(), no userspace faulting allowed. That's some supremely obnoxious limitations, which is why we need to sprinkle the right annotations to all relevant paths. The one big locking context we're leaving out here is mmu notifiers, added in commit 23b68395 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Mon Aug 26 22:14:21 2019 +0200 mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end that one covers a lot of other callsites, and it's also allowed to wait on dma-fences from mmu notifiers. But there's no ready-made functions exposed to prime this, so I've left it out for now. v2: Also track against mmu notifier context. v3: kerneldoc to spec the cross-driver contract. Note that currently i915 throws in a hard-coded 10s timeout on foreign fences (not sure why that was done, but it's there), which is why that rule is worded with SHOULD instead of MUST. Also some of the mmu_notifier/shrinker rules might surprise SoC drivers, I haven't fully audited them all. Which is infeasible anyway, we'll need to run them with lockdep and dma-fence annotations and see what goes boom. v4: A spelling fix from Mika v5: #ifdef for CONFIG_MMU_NOTIFIER. Reported by 0day. Unfortunately this means lockdep enforcement is slightly inconsistent, it won't spot GFP_NOIO and GFP_NOFS allocations in the wrong spot if CONFIG_MMU_NOTIFIER is disabled in the kernel config. Oh well. v5: Note that only drivers/gpu has a reasonable (or at least historical) excuse to use dma_fence_wait() from shrinker and mmu notifier callbacks. Everyone else should either have a better memory manager model, or better hardware. This reflects discussions with Jason Gunthorpe. Cc: Jason Gunthorpe <jgg@mellanox.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: kernel test robot <lkp@intel.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> (v4) Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-3-daniel.vetter@ffwll.ch
-
Daniel Vetter authored
Design is similar to the lockdep annotations for workers, but with some twists: - We use a read-lock for the execution/worker/completion side, so that this explicit annotation can be more liberally sprinkled around. With read locks lockdep isn't going to complain if the read-side isn't nested the same way under all circumstances, so ABBA deadlocks are ok. Which they are, since this is an annotation only. - We're using non-recursive lockdep read lock mode, since in recursive read lock mode lockdep does not catch read side hazards. And we _very_ much want read side hazards to be caught. For full details of this limitation see commit e9149858 Author: Peter Zijlstra <peterz@infradead.org> Date: Wed Aug 23 13:13:11 2017 +0200 locking/lockdep/selftests: Add mixed read-write ABBA tests - To allow nesting of the read-side explicit annotations we explicitly keep track of the nesting. lock_is_held() allows us to do that. - The wait-side annotation is a write lock, and entirely done within dma_fence_wait() for everyone by default. - To be able to freely annotate helper functions I want to make it ok to call dma_fence_begin/end_signalling from soft/hardirq context. First attempt was using the hardirq locking context for the write side in lockdep, but this forces all normal spinlocks nested within dma_fence_begin/end_signalling to be spinlocks. That bollocks. The approach now is to simple check in_atomic(), and for these cases entirely rely on the might_sleep() check in dma_fence_wait(). That will catch any wrong nesting against spinlocks from soft/hardirq contexts. The idea here is that every code path that's critical for eventually signalling a dma_fence should be annotated with dma_fence_begin/end_signalling. The annotation ideally starts right after a dma_fence is published (added to a dma_resv, exposed as a sync_file fd, attached to a drm_syncobj fd, or anything else that makes the dma_fence visible to other kernel threads), up to and including the dma_fence_wait(). Examples are irq handlers, the scheduler rt threads, the tail of execbuf (after the corresponding fences are visible), any workers that end up signalling dma_fences and really anything else. Not annotated should be code paths that only complete fences opportunistically as the gpu progresses, like e.g. shrinker/eviction code. The main class of deadlocks this is supposed to catch are: Thread A: mutex_lock(A); mutex_unlock(A); dma_fence_signal(); Thread B: mutex_lock(A); dma_fence_wait(); mutex_unlock(A); Thread B is blocked on A signalling the fence, but A never gets around to that because it cannot acquire the lock A. Note that dma_fence_wait() is allowed to be nested within dma_fence_begin/end_signalling sections. To allow this to happen the read lock needs to be upgraded to a write lock, which means that any other lock is acquired between the dma_fence_begin_signalling() call and the call to dma_fence_wait(), and still held, this will result in an immediate lockdep complaint. The only other option would be to not annotate such calls, defeating the point. Therefore these annotations cannot be sprinkled over the code entirely mindless to avoid false positives. Originally I hope that the cross-release lockdep extensions would alleviate the need for explicit annotations: https://lwn.net/Articles/709849/ But there's a few reasons why that's not an option: - It's not happening in upstream, since it got reverted due to too many false positives: commit e966eaee Author: Ingo Molnar <mingo@kernel.org> Date: Tue Dec 12 12:31:16 2017 +0100 locking/lockdep: Remove the cross-release locking checks This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y), while it found a number of old bugs initially, was also causing too many false positives that caused people to disable lockdep - which is arguably a worse overall outcome. - cross-release uses the complete() call to annotate the end of critical sections, for dma_fence that would be dma_fence_signal(). But we do not want all dma_fence_signal() calls to be treated as critical, since many are opportunistic cleanup of gpu requests. If these get stuck there's still the main completion interrupt and workers who can unblock everyone. Automatically annotating all dma_fence_signal() calls would hence cause false positives. - cross-release had some educated guesses for when a critical section starts, like fresh syscall or fresh work callback. This would again cause false positives without explicit annotations, since for dma_fence the critical sections only starts when we publish a fence. - Furthermore there can be cases where a thread never does a dma_fence_signal, but is still critical for reaching completion of fences. One example would be a scheduler kthread which picks up jobs and pushes them into hardware, where the interrupt handler or another completion thread calls dma_fence_signal(). But if the scheduler thread hangs, then all the fences hang, hence we need to manually annotate it. cross-release aimed to solve this by chaining cross-release dependencies, but the dependency from scheduler thread to the completion interrupt handler goes through hw where cross-release code can't observe it. In short, without manual annotations and careful review of the start and end of critical sections, cross-relese dependency tracking doesn't work. We need explicit annotations. v2: handle soft/hardirq ctx better against write side and dont forget EXPORT_SYMBOL, drivers can't use this otherwise. v3: Kerneldoc. v4: Some spelling fixes from Mika v5: Amend commit message to explain in detail why cross-release isn't the solution. v6: Pull out misplaced .rst hunk. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Dave Airlie <airlied@redhat.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: linux-rdma@vger.kernel.org Cc: amd-gfx@lists.freedesktop.org Cc: intel-gfx@lists.freedesktop.org Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-2-daniel.vetter@ffwll.ch
-
Christian König authored
The helper doesn't expose any not-mapable memory resources. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/377649/
-
- 20 Jul, 2020 12 commits
-
-
Alexander A. Klimov authored
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200719203714.61745-1-grandmaster@al2klimov.de
-
Alexander A. Klimov authored
Rationale: Reduces attack surface on kernel devs opening the links for MITM as HTTPS traffic is much harder to manipulate. Deterministic algorithm: For each file: If not .svg: For each line: If doesn't contain `\bxmlns\b`: For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`: If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`: If both the HTTP and HTTPS versions return 200 OK and serve the same content: Replace HTTP with HTTPS. Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200719171428.60470-1-grandmaster@al2klimov.de
-
Uwe Kleine-König authored
flags is unused since the driver was introduced in commit 45d59d70 ("drm: Add new driver for MXSFB controller"). Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Stefan Agner <stefan@agner.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200716174139.16602-1-u.kleine-koenig@pengutronix.de
-
Guido Günther authored
In contrast to other display controllers on imx like DCSS and ipuv3 lcdif/mxsfb does not support detiling e.g. vivante tiled layouts. Since mesa might assume otherwise make it explicit that only DRM_FORMAT_MOD_LINEAR is supported. Signed-off-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Stefan Agner <stefan@agner.ch> Link: https://patchwork.freedesktop.org/patch/msgid/26877532e272c12a74c33188e2a72abafc9a2e1c.1584973664.git.agx@sigxcpu.org
-
Suraj Upadhyay authored
Convert device logging with dev_* functions into drm_* functions. The patch has been generated with the coccinelle script below. The script focuses on instances of dev_* functions where the drm device context is clearly visible in its arguments. @@expression E1; expression list E2; @@ -dev_warn(E1->dev, E2) +drm_warn(E1, E2) @@expression E1; expression list E2; @@ -dev_info(E1->dev, E2) +drm_info(E1, E2) @@expression E1; expression list E2; @@ -dev_err(E1->dev, E2) +drm_err(E1, E2) @@expression E1; expression list E2; @@ -dev_info_once(E1->dev, E2) +drm_info_once(E1, E2) @@expression E1; expression list E2; @@ -dev_notice_once(E1->dev, E2) +drm_notice_once(E1, E2) @@expression E1; expression list E2; @@ -dev_warn_once(E1->dev, E2) +drm_warn_once(E1, E2) @@expression E1; expression list E2; @@ -dev_err_once(E1->dev, E2) +drm_err_once(E1, E2) @@expression E1; expression list E2; @@ -dev_err_ratelimited(E1->dev, E2) +drm_err_ratelimited(E1, E2) @@expression E1; expression list E2; @@ -dev_dbg(E1->dev, E2) +drm_dbg(E1, E2) Signed-off-by: Suraj Upadhyay <usuraj35@gmail.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200718150955.GA23103@blackclown
-
Christophe JAILLET authored
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below and has been hand modified to replace GFP_ with a correct flag. It has been compile tested. When memory is allocated in 'i810_dma_initialize()' GFP_KERNEL can be used because its only caller, 'i810_dma_init()', already use it and no lock is taken in the between. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20200718072822.339064-1-christophe.jaillet@wanadoo.fr
-
Thomas Zimmermann authored
Cleaning up ast's MM code with ast_mm_fini() resets the write-combine flags on the VRAM I/O memory. Drop ast_mm_fini() in favor of an auto- release callback. Releasing the device also executes the callback. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-7-tzimmermann@suse.de
-
Thomas Zimmermann authored
Posting the GPU requires the correct DRAM type to be stored in struct ast_private. Therefore first initialize the DRAM info and then post the GPU. This restores the original order of instructions in this function. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Fixes: bad09da6 ("drm/ast: Fixed vram size incorrect issue on POWER") Cc: Joel Stanley <joel@jms.id.au> Cc: Y.C. Chen <yc_chen@aspeedtech.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Dave Airlie <airlied@redhat.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Emil Velikov <emil.l.velikov@gmail.com> Cc: "Y.C. Chen" <yc_chen@aspeedtech.com> Cc: <stable@vger.kernel.org> # v4.11+ Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-6-tzimmermann@suse.de
-
Thomas Zimmermann authored
VRAM size detection is only relevant to the memory management. Move the code into ast_mm.c. While at it, rename the function to ast_get_vram_size(). The function argument's type is now struct ast_private. The result is stored in a local variable and not in struct ast_private any longer. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-5-tzimmermann@suse.de
-
Thomas Zimmermann authored
As a first step to managed MM code in ast, switch over VRAM MM helpers. v2: * updated to use drmm_vram_helper_init() Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-4-tzimmermann@suse.de
-
Thomas Zimmermann authored
Although built upon TTM, the ast driver's VRAM MM helper does not expose TTM to its users. Rename the related ast file to ast_mm.c. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-3-tzimmermann@suse.de
-
Thomas Zimmermann authored
Calling drmm_vram_helper_init() sets up a managed instance of VRAM MM. Releasing the DRM device also frees the memory manager. The patch also updates the DRM documentation for VRAM helpers. The tutorial now describes the new managed interface. The old interfaces are deprecated and should not be used in new code. v2: * rename init function to drmm_vram_helper_init() * return errno code from init function; caller does not need vram_mm anyway * update documentation and remove docs for deprecated un-managed functions Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-2-tzimmermann@suse.de
-
- 19 Jul, 2020 1 commit
-
-
Paul Cercueil authored
Silence compiler warning about used but uninitialized 'ipu_state' variable. In practice, the variable would never be used when uninitialized, but the compiler cannot know that 'priv->ipu_plane' will always be NULL if CONFIG_INGENIC_IPU is disabled. Silence the warning by initializing the value to NULL. Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200719093834.14084-1-paul@crapouillou.net
-
- 16 Jul, 2020 19 commits
-
-
Paul Cercueil authored
Bump version to 1.1 and set date to 2020-07-16. v3: New patch Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-12-paul@crapouillou.net
-
Paul Cercueil authored
Support multiple panels or bridges connected to the same DPI output of the SoC. This setup can be found for instance on the GCW Zero, where the same DPI output interfaces the internal 320x240 TFT panel, and the ITE IT6610 HDMI chip. v2: No change v3: Allow > 80-char lines where it makes sense Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-11-paul@crapouillou.net
-
Paul Cercueil authored
Add support for the Image Processing Unit (IPU) found in all Ingenic SoCs. The IPU can upscale and downscale a source frame of arbitrary size ranging from 4x4 to 4096x4096 on newer SoCs, with bicubic filtering on newer SoCs, bilinear filtering on older SoCs. Nearest-neighbour can also be obtained with proper coefficients. Starting from the JZ4725B, the IPU supports a mode where its output is sent directly to the LCDC, without having to be written to RAM first. This makes it possible to use the IPU as a DRM plane on the compatible SoCs, and have it convert and scale anything the userspace asks for to what's available for the display. Regarding pixel formats, older SoCs support packed YUV 4:2:2 and various planar YUV formats. Newer SoCs introduced support for RGB. Since the IPU is a separate hardware block, to make it work properly the Ingenic DRM driver will now register itself as a component master in case the IPU driver has been enabled in the config. When enabled in the config, the CRTC will see the IPU as a second primary plane. It cannot be enabled at the same time as the regular primary plane. It has the same priority, which means that it will also display below the overlay plane. v2: - ingenic-ipu is no longer its own module. It will be built into the ingenic-drm module. - If enabled in the config, both the core driver and the IPU driver will register as components; otherwise the core driver will bypass that and call the ingenic_drm_bind() function directly. - Since both files now build into the same module, the symbols previously exported as GPL are not exported anymore, since they are only used internally. - Fix SPDX license header in ingenic-ipu.h - Avoid using 'for(;;);' loops without trailing statement(s) v3: - Pass priv structure to IRQ handler; that way we don't hardcode the expectation that the IPU plane is at index #0. - Rework osd_changed() to account for src_* changes - Add multiplanar YUV 4:4:4 support - Commit fb addresses to HW at vblank, since addr registers are not shadow registers - Probe IPU component later so that IPU plane is last - Fix driver not working on IPU-less hardware - Use IPU driver's name as the IRQ name to avoid having two 'ingenic-drm' in /proc/interrupts - Fix IPU only working for still images on JZ4725B - Add a bit more code comments Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-10-paul@crapouillou.net
-
Paul Cercueil authored
All Ingenic SoCs starting from the JZ4725B support OSD mode. In this mode, two separate planes can be used. They can have different positions and sizes, and one can be overlayed on top of the other. v2: Use fallthrough; instead of /* fall-through */ v3: - Add custom atomic_tail function to handle case where HW gives no VBLANK - Use regmap_set_bits() / regmap_clear_bits() when possible - Use dma_hwdesc_f{0,1} fields in priv structure instead of array - Use dmam_alloc_coherent() instead of dma_alloc_coherent() - Use more meaningful 0xf0 / 0xf1 values as DMA descriptors IDs - Add a bit more code comments Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-9-paul@crapouillou.net
-
Paul Cercueil authored
Use dmam_alloc_coherent() instead of dma_alloc_coherent(). Then we don't need to register a custom cleanup handler. v3: New patch Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-8-paul@crapouillou.net
-
Paul Cercueil authored
Move the register definitions to ingenic-drm.h, to keep ingenic-drm-drv.c tidy. v2: Fix SPDX license tag v3: No change Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-7-paul@crapouillou.net
-
Paul Cercueil authored
The address of the DMA descriptor never changes. It can therefore be set in the probe function. v2-v3: No change Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-6-paul@crapouillou.net
-
Paul Cercueil authored
If you pass a string that is not terminated with a carriage return to dev_err(), it will eventually be printed with a carriage return, but not right away, since the kernel will wait for a pr_cont(). v2: New patch v3: No change Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-5-paul@crapouillou.net
-
Paul Cercueil authored
Full rename without any modification, except to the Makefile. Renaming ingenic-drm.c to ingenic-drm-drv.c allow to decouple the module name from the source file name in the Makefile. This will be useful later when more source files are added. v2: New patch v3: No change Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-4-paul@crapouillou.net
-
Paul Cercueil authored
Add documentation of the Device Tree bindings for the Image Processing Unit (IPU) found in most Ingenic SoCs. v2: Add missing 'const' in items list v3: No change Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-3-paul@crapouillou.net
-
Paul Cercueil authored
Convert the ingenic,lcd.txt to a new ingenic,lcd.yaml file. In the process, the new ingenic,jz4780-lcd compatible string has been added. v2: Add info about IPU at port@8 v3: No change Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-2-paul@crapouillou.net
-
Paul Cercueil authored
plane->index is NOT the index of the color plane in a YUV frame. Actually, a YUV frame is represented by a single drm_plane, even though it contains three Y, U, V planes. v2-v3: No change Cc: stable@vger.kernel.org # v5.3 Fixes: 90b86fcc ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs") Signed-off-by: Paul Cercueil <paul@crapouillou.net> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-1-paul@crapouillou.net
-
Lyude Paul authored
While I had thought I'd tested this before, it looks like this one issue slipped by my original CRC patches. Basically, there seem to be a few rules we need to follow when sending CRC commands to the display controller: * CRCs cannot be both disabled and enabled for a single head in the same flush * If a head with CRC reporting enabled switches from one OR to another, there must be a flush before the OR is re-enabled regardless of the final state of CRC reporting. So, split nv50_crc_atomic_prepare_notifier_contexts() into two functions: * nv_crc_atomic_release_notifier_contexts() - checks whether the CRC notifier contexts were released successfully after the first flush * nv_crc_atomic_init_notifier_contexts() - prepares any CRC notifier contexts for use before enabling reporting Additionally, in order to force a flush when we re-assign ORs with heads that have CRCs enabled we split our atomic check function into two: * nv50_crc_atomic_check_head() - called from our heads' atomic checks, determines whether a state needs to set or clear CRC reporting * nv50_crc_atomic_check_outp() - called at the end of the atomic check after all ORs have been added to the atomic state, and sets nv50_atom->flush_disable if needed Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <skeggsb@gmail.com> Acked-by: Dave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200629223635.103804-1-lyude@redhat.com
-
Lyude Paul authored
This introduces support for CRC readback on gf119+, using the documentation generously provided to us by Nvidia: https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed through a single set of "outp" sources: outp-active/auto for a CRC of the scanout region, outp-complete for a CRC of both the scanout and blanking/sync region combined, and outp-inactive for a CRC of only the blanking/sync region. For each source, nouveau selects the appropriate tap point based on the output path in use. We also expose an "rg" source, which allows for capturing CRCs of the scanout raster before it's encoded into a video signal in the output path. This tap point is referred to as the raster generator. Note that while there's some other neat features that can be used with CRC capture on nvidia hardware, like capturing from two CRC sources simultaneously, I couldn't see any usecase for them and did not implement them. Nvidia only allows for accessing CRCs through a shared DMA region that we program through the core EVO/NvDisplay channel which is referred to as the notifier context. The notifier context is limited to either 255 (for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and unfortunately the hardware simply drops CRCs and reports an overflow once all available entries in the notifier context are filled. Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit on how many CRCs can be captured, we work around this in nouveau by allocating two separate notifier contexts for each head instead of one. We schedule a vblank worker ahead of time so that once we start getting close to filling up all of the available entries in the notifier context, we can swap the currently used notifier context out with another pre-prepared notifier context in a manner similar to page flipping. Unfortunately, the hardware only allows us to this by flushing two separate updates on the core channel: one to release the current notifier context handle, and one to program the next notifier context's handle. When the hardware processes the first update, the CRC for the current frame is lost. However, the second update can be flushed immediately without waiting for the first to complete so that CRC generation resumes on the next frame. According to Nvidia's hardware engineers, there isn't any cleaner way of flipping notifier contexts that would avoid this. Since using vblank workers to swap out the notifier context will ensure we can usually flush both updates to hardware within the timespan of a single frame, we can also ensure that there will only be exactly one frame lost between the first and second update being executed by the hardware. This gives us the guarantee that we're always correctly matching each CRC entry with it's respective frame even after a context flip. And since IGT will retrieve the CRC entry for a frame by waiting until it receives a CRC for any subsequent frames, this doesn't cause an issue with any tests and is much simpler than trying to change the current DRM API to accommodate. In order to facilitate testing of correct handling of this limitation, we also expose a debugfs interface to manually control the threshold for when we start trying to flip the notifier context. We will use this in igt to trigger a context flip for testing purposes without needing to wait for the notifier to completely fill up. This threshold is reset to the default value set by nouveau after each capture, and is exposed in a separate folder within each CRTC's debugfs directory labelled "nv_crc". Changes since v1: * Forgot to finish saving crc.h before saving, whoops. This just adds some corrections to the empty function declarations that we use if CONFIG_DEBUG_FS isn't enabled. Changes since v2: * Don't check return code from debugfs_create_dir() or debugfs_create_file() - Greg K-H Changes since v3: (no functional changes) * Fix SPDX license identifiers (checkpatch) * s/uint32_t/u32/ (checkpatch) * Fix indenting in switch cases (checkpatch) Changes since v4: * Remove unneeded param changes with nv50_head_flush_clr/set * Rebase Changes since v5: * Remove set but unused variable (outp) in nv50_crc_atomic_check() - Kbuild bot Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Dave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com
-
Lyude Paul authored
While most of the functionality on Nvidia GPUs doesn't require using an explicit handle instead of the main VRAM handle + offset, there are a couple of places that do require explicit handles, such as CRC functionality. Since this means we're about to add another nouveau-chosen handle, let's just go ahead and move any hard-coded handles into a single header. This is just to keep things slightly organized, and to make it a little bit easier if we need to add more handles in the future. This patch should contain no functional changes. Changes since v3: * Correct SPDX license identifier (checkpatch) Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Dave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-9-lyude@redhat.com
-
Lyude Paul authored
In order to make sure that we flush disable updates at the right time when disabling CRCs, we'll need to be able to look at the outp state to see if we're changing it at the same time that we're disabling CRCs. So, expose the struct in disp.h. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Dave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-8-lyude@redhat.com
-
Lyude Paul authored
While we're not quite ready yet to add support for flexible wndw mappings, we are going to need to at least keep track of the static wndw mappings we're currently using in each head's atomic state. We'll likely use this in the future to implement real flexible window mapping, but the primary reason we'll need this is for CRC support. See: on nvidia hardware, each CRC entry in the CRC notifier dma context has a "tag". This tag corresponds to the nth update on a specific EVO/NvDisplay channel, which itself is referred to as the "controlling channel". For gf119+ this can be the core channel, ovly channel, or base channel. Since we don't expose CRC entry tags to userspace, we simply ignore this feature and always use the core channel as the controlling channel. Simple. Things get a little bit more complicated on gv100+ though. GV100+ only lets us set the controlling channel to a specific wndw channel, and that wndw must be owned by the head that we're grabbing CRCs when we enable CRC generation. Thus, we always need to make sure that each atomic head state has at least one wndw that is mapped to the head, which will be used as the controlling channel. Note that since we don't have flexible wndw mappings yet, we don't expect to run into any scenarios yet where we'd have a head with no mapped wndws. When we do add support for flexible wndw mappings however, we'll need to make sure that we handle reprogramming CRC capture if our controlling wndw is moved to another head (and potentially reject the new head state entirely if we can't find another available wndw to replace it). With that being said, nouveau currently tracks wndw visibility on heads. It does not keep track of the actual ownership mappings, which are (currently) statically programmed. To fix this, we introduce another bitmask into nv50_head_atom.wndw to keep track of ownership separately from visibility. We then introduce a nv50_head callback to handle populating the wndw ownership map, and call it during the atomic check phase when core->assign_windows is set to true. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Dave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-7-lyude@redhat.com
-
Lyude Paul authored
While we expose the ability to turn off hardware dithering for nouveau, we actually make the mistake of turning it on anyway, due to dithering_depth containing a non-zero value if our dithering depth isn't also set to 6 bpc. So, fix it by never enabling dithering when it's disabled. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Dave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-6-lyude@redhat.com
-
Lyude Paul authored
Currently, we modify the depth value stored in the atomic state when performing a commit in order to workaround the fact we haven't implemented support for depths higher then 10 yet. This isn't idempotent though, as it will happen every atomic commit where we modify the OR state even if the head's depth in the atomic state hasn't been modified. Normally this wouldn't matter, since we don't modify OR state outside of modesets, but since the CRC capture region is implemented as part of the OR state in hardware we'll want to make sure all commits modifying OR state are idempotent so as to avoid changing the depth unexpectedly. So, fix this by simply not writing the reduced depth value we come up with to the atomic state. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Acked-by: Dave Airlie <airlied@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-5-lyude@redhat.com
-