Commits · c1c440d41aff2fa22027ed51afcc4c69709515eb · nexedi / linux

21 Jul, 2020 8 commits

drm/ttm: cleanup coding style and implementation. · c1c440d4

Christian König authored Jul 15, 2020

Only functional change is to always keep io_reserved_count up to date
for debugging even when it is not used otherwise.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378242/

c1c440d4

drm/ttm: remove io_reserve_fastpath flag · ce747733

Christian König authored Jul 15, 2020

Just use the use_io_reserve_lru flag. It doesn't make much
sense to have two flags.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378238/

ce747733

drm/ttm: cleanup io_mem interface with nouveau · 4b8edc39

Christian König authored Jul 13, 2020

Nouveau is the only user of this functionality and evicting io space
on -EAGAIN is really a misuse of the return code.

Instead switch to using -ENOSPC here which makes much more sense and
simplifies the code.

This could unbreak something as we now cleanly return EAGAIN, but the
chance for this are rather low.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/378237/

4b8edc39

drm: remove optional dummy function from drivers using TTM · e69acf18

Christian König authored Jul 10, 2020

Implementing those is completely unnecessary.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Madhav Chauhan <madhav.chauhan@amd.com>
Link: https://patchwork.freedesktop.org/patch/378236/

e69acf18

dma-buf.rst: Document why indefinite fences are a bad idea · 72b6ede7

Daniel Vetter authored Jul 09, 2020

Comes up every few years, gets somewhat tedious to discuss, let's
write this down once and for all.

What I'm not sure about is whether the text should be more explicit in
flat out mandating the amdkfd eviction fences for long running compute
workloads or workloads where userspace fencing is allowed.

v2: Now with dot graph!

v3: Typo (Dave Airlie)
Reviewed-by: Thomas Hellstrom <thomas.hellstrom@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jesse Natalie <jenatali@microsoft.com>
Cc: Steve Pronovost <spronovo@microsoft.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200709123339.547390-1-daniel.vetter@ffwll.ch

72b6ede7

dma-fence: prime lockdep annotations · d0b9a9ae

Daniel Vetter authored Jul 07, 2020

Two in one go:
- it is allowed to call dma_fence_wait() while holding a
  dma_resv_lock(). This is fundamental to how eviction works with ttm,
  so required.

- it is allowed to call dma_fence_wait() from memory reclaim contexts,
  specifically from shrinker callbacks (which i915 does), and from mmu
  notifier callbacks (which amdgpu does, and which i915 sometimes also
  does, and probably always should, but that's kinda a debate). Also
  for stuff like HMM we really need to be able to do this, or things
  get real dicey.

Consequence is that any critical path necessary to get to a
dma_fence_signal for a fence must never a) call dma_resv_lock nor b)
allocate memory with GFP_KERNEL. Also by implication of
dma_resv_lock(), no userspace faulting allowed. That's some supremely
obnoxious limitations, which is why we need to sprinkle the right
annotations to all relevant paths.

The one big locking context we're leaving out here is mmu notifiers,
added in

commit 23b68395
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Mon Aug 26 22:14:21 2019 +0200

    mm/mmu_notifiers: add a lockdep map for invalidate_range_start/end

that one covers a lot of other callsites, and it's also allowed to
wait on dma-fences from mmu notifiers. But there's no ready-made
functions exposed to prime this, so I've left it out for now.

v2: Also track against mmu notifier context.

v3: kerneldoc to spec the cross-driver contract. Note that currently
i915 throws in a hard-coded 10s timeout on foreign fences (not sure
why that was done, but it's there), which is why that rule is worded
with SHOULD instead of MUST.

Also some of the mmu_notifier/shrinker rules might surprise SoC
drivers, I haven't fully audited them all. Which is infeasible anyway,
we'll need to run them with lockdep and dma-fence annotations and see
what goes boom.

v4: A spelling fix from Mika

v5: #ifdef for CONFIG_MMU_NOTIFIER. Reported by 0day. Unfortunately
this means lockdep enforcement is slightly inconsistent, it won't spot
GFP_NOIO and GFP_NOFS allocations in the wrong spot if
CONFIG_MMU_NOTIFIER is disabled in the kernel config. Oh well.

v5: Note that only drivers/gpu has a reasonable (or at least
historical) excuse to use dma_fence_wait() from shrinker and mmu
notifier callbacks. Everyone else should either have a better memory
manager model, or better hardware. This reflects discussions with
Jason Gunthorpe.

Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: kernel test robot <lkp@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> (v4)
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-3-daniel.vetter@ffwll.ch

d0b9a9ae

dma-fence: basic lockdep annotations · 5fbff813

Daniel Vetter authored Jul 07, 2020

Design is similar to the lockdep annotations for workers, but with
some twists:

- We use a read-lock for the execution/worker/completion side, so that
  this explicit annotation can be more liberally sprinkled around.
  With read locks lockdep isn't going to complain if the read-side
  isn't nested the same way under all circumstances, so ABBA deadlocks
  are ok. Which they are, since this is an annotation only.

- We're using non-recursive lockdep read lock mode, since in recursive
  read lock mode lockdep does not catch read side hazards. And we
  _very_ much want read side hazards to be caught. For full details of
  this limitation see

  commit e9149858
  Author: Peter Zijlstra <peterz@infradead.org>
  Date:   Wed Aug 23 13:13:11 2017 +0200

      locking/lockdep/selftests: Add mixed read-write ABBA tests

- To allow nesting of the read-side explicit annotations we explicitly
  keep track of the nesting. lock_is_held() allows us to do that.

- The wait-side annotation is a write lock, and entirely done within
  dma_fence_wait() for everyone by default.

- To be able to freely annotate helper functions I want to make it ok
  to call dma_fence_begin/end_signalling from soft/hardirq context.
  First attempt was using the hardirq locking context for the write
  side in lockdep, but this forces all normal spinlocks nested within
  dma_fence_begin/end_signalling to be spinlocks. That bollocks.

  The approach now is to simple check in_atomic(), and for these cases
  entirely rely on the might_sleep() check in dma_fence_wait(). That
  will catch any wrong nesting against spinlocks from soft/hardirq
  contexts.

The idea here is that every code path that's critical for eventually
signalling a dma_fence should be annotated with
dma_fence_begin/end_signalling. The annotation ideally starts right
after a dma_fence is published (added to a dma_resv, exposed as a
sync_file fd, attached to a drm_syncobj fd, or anything else that
makes the dma_fence visible to other kernel threads), up to and
including the dma_fence_wait(). Examples are irq handlers, the
scheduler rt threads, the tail of execbuf (after the corresponding
fences are visible), any workers that end up signalling dma_fences and
really anything else. Not annotated should be code paths that only
complete fences opportunistically as the gpu progresses, like e.g.
shrinker/eviction code.

The main class of deadlocks this is supposed to catch are:

Thread A:

	mutex_lock(A);
	mutex_unlock(A);

	dma_fence_signal();

Thread B:

	mutex_lock(A);
	dma_fence_wait();
	mutex_unlock(A);

Thread B is blocked on A signalling the fence, but A never gets around
to that because it cannot acquire the lock A.

Note that dma_fence_wait() is allowed to be nested within
dma_fence_begin/end_signalling sections. To allow this to happen the
read lock needs to be upgraded to a write lock, which means that any
other lock is acquired between the dma_fence_begin_signalling() call and
the call to dma_fence_wait(), and still held, this will result in an
immediate lockdep complaint. The only other option would be to not
annotate such calls, defeating the point. Therefore these annotations
cannot be sprinkled over the code entirely mindless to avoid false
positives.

Originally I hope that the cross-release lockdep extensions would
alleviate the need for explicit annotations:

https://lwn.net/Articles/709849/

But there's a few reasons why that's not an option:

- It's not happening in upstream, since it got reverted due to too
  many false positives:

	commit e966eaee
	Author: Ingo Molnar <mingo@kernel.org>
	Date:   Tue Dec 12 12:31:16 2017 +0100

	    locking/lockdep: Remove the cross-release locking checks

	    This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
	    while it found a number of old bugs initially, was also causing too many
	    false positives that caused people to disable lockdep - which is arguably
	    a worse overall outcome.

- cross-release uses the complete() call to annotate the end of
  critical sections, for dma_fence that would be dma_fence_signal().
  But we do not want all dma_fence_signal() calls to be treated as
  critical, since many are opportunistic cleanup of gpu requests. If
  these get stuck there's still the main completion interrupt and
  workers who can unblock everyone. Automatically annotating all
  dma_fence_signal() calls would hence cause false positives.

- cross-release had some educated guesses for when a critical section
  starts, like fresh syscall or fresh work callback. This would again
  cause false positives without explicit annotations, since for
  dma_fence the critical sections only starts when we publish a fence.

- Furthermore there can be cases where a thread never does a
  dma_fence_signal, but is still critical for reaching completion of
  fences. One example would be a scheduler kthread which picks up jobs
  and pushes them into hardware, where the interrupt handler or
  another completion thread calls dma_fence_signal(). But if the
  scheduler thread hangs, then all the fences hang, hence we need to
  manually annotate it. cross-release aimed to solve this by chaining
  cross-release dependencies, but the dependency from scheduler thread
  to the completion interrupt handler goes through hw where
  cross-release code can't observe it.

In short, without manual annotations and careful review of the start
and end of critical sections, cross-relese dependency tracking doesn't
work. We need explicit annotations.

v2: handle soft/hardirq ctx better against write side and dont forget
EXPORT_SYMBOL, drivers can't use this otherwise.

v3: Kerneldoc.

v4: Some spelling fixes from Mika

v5: Amend commit message to explain in detail why cross-release isn't
the solution.

v6: Pull out misplaced .rst hunk.
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Thomas Hellstrom <thomas.hellstrom@intel.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Cc: linux-rdma@vger.kernel.org
Cc: amd-gfx@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200707201229.472834-2-daniel.vetter@ffwll.ch

5fbff813

drm/vram-helper: stop using TTM_MEMTYPE_FLAG_MAPPABLE · 23f166ca

Christian König authored Jul 16, 2020

The helper doesn't expose any not-mapable memory resources.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/377649/

23f166ca

20 Jul, 2020 12 commits

video: fbdev: Replace HTTP links with HTTPS ones · 7c7b2a35

Alexander A. Klimov authored Jul 19, 2020

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200719203714.61745-1-grandmaster@al2klimov.de

7c7b2a35

drm: Replace HTTP links with HTTPS ones · b0487e0d

Alexander A. Klimov authored Jul 19, 2020

Rationale:
Reduces attack surface on kernel devs opening the links for MITM
as HTTPS traffic is much harder to manipulate.

Deterministic algorithm:
For each file:
  If not .svg:
    For each line:
      If doesn't contain `\bxmlns\b`:
        For each link, `\bhttp://[^# \t\r\n]*(?:\w|/)`:
	  If neither `\bgnu\.org/license`, nor `\bmozilla\.org/MPL\b`:
            If both the HTTP and HTTPS versions
            return 200 OK and serve the same content:
              Replace HTTP with HTTPS.
Signed-off-by: Alexander A. Klimov <grandmaster@al2klimov.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200719171428.60470-1-grandmaster@al2klimov.de

b0487e0d

drm/mxsfb: drop unused function parameter · cf73db84

Uwe Kleine-König authored Jul 16, 2020

flags is unused since the driver was introduced in commit 45d59d70
("drm: Add new driver for MXSFB controller").
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716174139.16602-1-u.kleine-koenig@pengutronix.de

cf73db84

drm/mxsfb: Make supported modifiers explicit · f4b29bf7

Guido Günther authored Mar 23, 2020

In contrast to other display controllers on imx like DCSS and ipuv3
lcdif/mxsfb does not support detiling e.g. vivante tiled layouts.
Since mesa might assume otherwise make it explicit that only
DRM_FORMAT_MOD_LINEAR is supported.
Signed-off-by: Guido Günther <agx@sigxcpu.org>
Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Stefan Agner <stefan@agner.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/26877532e272c12a74c33188e2a72abafc9a2e1c.1584973664.git.agx@sigxcpu.org

f4b29bf7

drm: core: Convert device logging to drm_* functions. · 6d45fff5

Suraj Upadhyay authored Jul 18, 2020

Convert device logging with dev_* functions into drm_* functions.

The patch has been generated with the coccinelle script below.
The script focuses on instances of dev_* functions where the drm device
context is clearly visible in its arguments.

@@expression E1; expression list E2; @@
-dev_warn(E1->dev, E2)
+drm_warn(E1, E2)

@@expression E1; expression list E2; @@
-dev_info(E1->dev, E2)
+drm_info(E1, E2)

@@expression E1; expression list E2; @@
-dev_err(E1->dev, E2)
+drm_err(E1, E2)

@@expression E1; expression list E2; @@
-dev_info_once(E1->dev, E2)
+drm_info_once(E1, E2)

@@expression E1; expression list E2; @@
-dev_notice_once(E1->dev, E2)
+drm_notice_once(E1, E2)

@@expression E1; expression list E2; @@
-dev_warn_once(E1->dev, E2)
+drm_warn_once(E1, E2)

@@expression E1; expression list E2; @@
-dev_err_once(E1->dev, E2)
+drm_err_once(E1, E2)

@@expression E1; expression list E2; @@
-dev_err_ratelimited(E1->dev, E2)
+drm_err_ratelimited(E1, E2)

@@expression E1; expression list E2; @@
-dev_dbg(E1->dev, E2)
+drm_dbg(E1, E2)
Signed-off-by: Suraj Upadhyay <usuraj35@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200718150955.GA23103@blackclown

6d45fff5

drm/i810: switch from 'pci_' to 'dma_' API · 880a7485

Christophe JAILLET authored Jul 18, 2020

The wrappers in include/linux/pci-dma-compat.h should go away.

The patch has been generated with the coccinelle script below and has been
hand modified to replace GFP_ with a correct flag.
It has been compile tested.

When memory is allocated in 'i810_dma_initialize()' GFP_KERNEL can be used
because its only caller, 'i810_dma_init()', already use it and no lock is
taken in the between.

@@
@@
-    PCI_DMA_BIDIRECTIONAL
+    DMA_BIDIRECTIONAL

@@
@@
-    PCI_DMA_TODEVICE
+    DMA_TO_DEVICE

@@
@@
-    PCI_DMA_FROMDEVICE
+    DMA_FROM_DEVICE

@@
@@
-    PCI_DMA_NONE
+    DMA_NONE

@@
expression e1, e2, e3;
@@
-    pci_alloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3;
@@
-    pci_zalloc_consistent(e1, e2, e3)
+    dma_alloc_coherent(&e1->dev, e2, e3, GFP_)

@@
expression e1, e2, e3, e4;
@@
-    pci_free_consistent(e1, e2, e3, e4)
+    dma_free_coherent(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_single(e1, e2, e3, e4)
+    dma_map_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_single(e1, e2, e3, e4)
+    dma_unmap_single(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4, e5;
@@
-    pci_map_page(e1, e2, e3, e4, e5)
+    dma_map_page(&e1->dev, e2, e3, e4, e5)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_page(e1, e2, e3, e4)
+    dma_unmap_page(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_map_sg(e1, e2, e3, e4)
+    dma_map_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_unmap_sg(e1, e2, e3, e4)
+    dma_unmap_sg(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_cpu(e1, e2, e3, e4)
+    dma_sync_single_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_single_for_device(e1, e2, e3, e4)
+    dma_sync_single_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_cpu(e1, e2, e3, e4)
+    dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4)

@@
expression e1, e2, e3, e4;
@@
-    pci_dma_sync_sg_for_device(e1, e2, e3, e4)
+    dma_sync_sg_for_device(&e1->dev, e2, e3, e4)

@@
expression e1, e2;
@@
-    pci_dma_mapping_error(e1, e2)
+    dma_mapping_error(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_dma_mask(e1, e2)
+    dma_set_mask(&e1->dev, e2)

@@
expression e1, e2;
@@
-    pci_set_consistent_dma_mask(e1, e2)
+    dma_set_coherent_mask(&e1->dev, e2)
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20200718072822.339064-1-christophe.jaillet@wanadoo.fr

880a7485

drm/ast: Use managed MM initialization · 03ba7e00

Thomas Zimmermann authored Jul 16, 2020

Cleaning up ast's MM code with ast_mm_fini() resets the write-combine
flags on the VRAM I/O memory. Drop ast_mm_fini() in favor of an auto-
release callback. Releasing the device also executes the callback.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-7-tzimmermann@suse.de

03ba7e00

drm/ast: Initialize DRAM type before posting GPU · 244d0128

Thomas Zimmermann authored Jul 16, 2020

Posting the GPU requires the correct DRAM type to be stored in
struct ast_private. Therefore first initialize the DRAM info and
then post the GPU. This restores the original order of instructions
in this function.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Fixes: bad09da6 ("drm/ast: Fixed vram size incorrect issue on POWER")
Cc: Joel Stanley <joel@jms.id.au>
Cc: Y.C. Chen <yc_chen@aspeedtech.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Cc: "Y.C. Chen" <yc_chen@aspeedtech.com>
Cc: <stable@vger.kernel.org> # v4.11+
Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-6-tzimmermann@suse.de

244d0128

drm/ast: Move VRAM size detection to ast_mm.c · 0149e780

Thomas Zimmermann authored Jul 16, 2020

VRAM size detection is only relevant to the memory management. Move
the code into ast_mm.c.

While at it, rename the function to ast_get_vram_size(). The function
argument's type is now struct ast_private. The result is stored in a
local variable and not in struct ast_private any longer.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-5-tzimmermann@suse.de

0149e780

drm/ast: Use managed VRAM-helper initialization · 8e46dc58

Thomas Zimmermann authored Jul 16, 2020

As a first step to managed MM code in ast, switch over VRAM MM helpers.

v2:
	* updated to use drmm_vram_helper_init()
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-4-tzimmermann@suse.de

8e46dc58

drm/ast: Rename ast_ttm.c to ast_mm.c · 48fde424

Thomas Zimmermann authored Jul 16, 2020

Although built upon TTM, the ast driver's VRAM MM helper does not
expose TTM to its users. Rename the related ast file to ast_mm.c.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-3-tzimmermann@suse.de

48fde424

drm/vram-helper: Managed vram helpers · a5f23a72

Thomas Zimmermann authored Jul 16, 2020

Calling drmm_vram_helper_init() sets up a managed instance of
VRAM MM. Releasing the DRM device also frees the memory manager.

The patch also updates the DRM documentation for VRAM helpers. The
tutorial now describes the new managed interface. The old interfaces
are deprecated and should not be used in new code.

v2:
	* rename init function to drmm_vram_helper_init()
	* return errno code from init function; caller does not
	  need vram_mm anyway
	* update documentation and remove docs for deprecated
	  un-managed functions
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716125353.31512-2-tzimmermann@suse.de

a5f23a72

19 Jul, 2020 1 commit

drm/ingenic: Silence uninitialized-variable warning · 40a55dc1

Paul Cercueil authored Jul 19, 2020

Silence compiler warning about used but uninitialized 'ipu_state'
variable. In practice, the variable would never be used when
uninitialized, but the compiler cannot know that 'priv->ipu_plane' will
always be NULL if CONFIG_INGENIC_IPU is disabled.

Silence the warning by initializing the value to NULL.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200719093834.14084-1-paul@crapouillou.net

40a55dc1

16 Jul, 2020 19 commits

drm/ingenic: Bump driver to version 1.1 · a786e8ca

Paul Cercueil authored Jul 16, 2020

Bump version to 1.1 and set date to 2020-07-16.

v3: New patch
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-12-paul@crapouillou.net

a786e8ca

drm/ingenic: Support multiple panels/bridges · c369cb27

Paul Cercueil authored Jul 16, 2020

Support multiple panels or bridges connected to the same DPI output of
the SoC. This setup can be found for instance on the GCW Zero, where the
same DPI output interfaces the internal 320x240 TFT panel, and the ITE
IT6610 HDMI chip.

v2: No change

v3: Allow > 80-char lines where it makes sense
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-11-paul@crapouillou.net

c369cb27

drm/ingenic: Add support for the IPU · fc1acf31

Paul Cercueil authored Jul 16, 2020

Add support for the Image Processing Unit (IPU) found in all Ingenic
SoCs.

The IPU can upscale and downscale a source frame of arbitrary size
ranging from 4x4 to 4096x4096 on newer SoCs, with bicubic filtering
on newer SoCs, bilinear filtering on older SoCs. Nearest-neighbour can
also be obtained with proper coefficients.

Starting from the JZ4725B, the IPU supports a mode where its output is
sent directly to the LCDC, without having to be written to RAM first.
This makes it possible to use the IPU as a DRM plane on the compatible
SoCs, and have it convert and scale anything the userspace asks for to
what's available for the display.

Regarding pixel formats, older SoCs support packed YUV 4:2:2 and various
planar YUV formats. Newer SoCs introduced support for RGB.

Since the IPU is a separate hardware block, to make it work properly the
Ingenic DRM driver will now register itself as a component master in
case the IPU driver has been enabled in the config.

When enabled in the config, the CRTC will see the IPU as a second primary
plane. It cannot be enabled at the same time as the regular primary
plane. It has the same priority, which means that it will also display
below the overlay plane.

v2: - ingenic-ipu is no longer its own module. It will be built
into the ingenic-drm module.
- If enabled in the config, both the core driver and the IPU
driver will register as components; otherwise the core
driver will bypass that and call the ingenic_drm_bind()
function directly.
- Since both files now build into the same module, the
symbols previously exported as GPL are not exported anymore,
since they are only used internally.
- Fix SPDX license header in ingenic-ipu.h
- Avoid using 'for(;;);' loops without trailing statement(s)

v3: - Pass priv structure to IRQ handler; that way we don't hardcode
the expectation that the IPU plane is at index #0.
- Rework osd_changed() to account for src_* changes
- Add multiplanar YUV 4:4:4 support
- Commit fb addresses to HW at vblank, since addr registers are
not shadow registers
- Probe IPU component later so that IPU plane is last
- Fix driver not working on IPU-less hardware
- Use IPU driver's name as the IRQ name to avoid having two
'ingenic-drm' in /proc/interrupts
- Fix IPU only working for still images on JZ4725B
- Add a bit more code comments
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-10-paul@crapouillou.net

fc1acf31

drm/ingenic: Add support for OSD mode · 3c9bea4e

Paul Cercueil authored Jul 16, 2020

All Ingenic SoCs starting from the JZ4725B support OSD mode.

In this mode, two separate planes can be used. They can have different
positions and sizes, and one can be overlayed on top of the other.

v2: Use fallthrough; instead of /* fall-through */

v3: - Add custom atomic_tail function to handle case where HW gives no
      VBLANK
    - Use regmap_set_bits() / regmap_clear_bits() when possible
    - Use dma_hwdesc_f{0,1} fields in priv structure instead of array
    - Use dmam_alloc_coherent() instead of dma_alloc_coherent()
    - Use more meaningful 0xf0 / 0xf1 values as DMA descriptors IDs
    - Add a bit more code comments
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-9-paul@crapouillou.net

3c9bea4e

drm/ingenic: Use dmam_alloc_coherent() · 0a746db7

Paul Cercueil authored Jul 16, 2020

Use dmam_alloc_coherent() instead of dma_alloc_coherent(). Then we don't
need to register a custom cleanup handler.

v3: New patch
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-8-paul@crapouillou.net

0a746db7

drm/ingenic: Move register definitions to ingenic-drm.h · 4b11cb7f

Paul Cercueil authored Jul 16, 2020

Move the register definitions to ingenic-drm.h, to keep
ingenic-drm-drv.c tidy.

v2: Fix SPDX license tag
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-7-paul@crapouillou.net

4b11cb7f

drm/ingenic: Set DMA descriptor chain address in probe · e5507d2c

Paul Cercueil authored Jul 16, 2020

The address of the DMA descriptor never changes. It can therefore be set
in the probe function.

v2-v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-6-paul@crapouillou.net

e5507d2c

drm/ingenic: Add missing CR in debug strings · 1f7596f4

Paul Cercueil authored Jul 16, 2020

If you pass a string that is not terminated with a carriage return to
dev_err(), it will eventually be printed with a carriage return, but
not right away, since the kernel will wait for a pr_cont().

v2: New patch
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-5-paul@crapouillou.net

1f7596f4

drm/ingenic: Rename ingenic-drm.c to ingenic-drm-drv.c · 54fe8942

Paul Cercueil authored Jul 16, 2020

Full rename without any modification, except to the Makefile.

Renaming ingenic-drm.c to ingenic-drm-drv.c allow to decouple the module
name from the source file name in the Makefile. This will be useful
later when more source files are added.

v2: New patch
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-4-paul@crapouillou.net

54fe8942

dt-bindings: display: Add ingenic,ipu.yaml · ba8989a6

Paul Cercueil authored Jul 16, 2020

Add documentation of the Device Tree bindings for the Image Processing
Unit (IPU) found in most Ingenic SoCs.

v2: Add missing 'const' in items list
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-3-paul@crapouillou.net

ba8989a6

dt-bindings: display: Convert ingenic,lcd.txt to YAML · c9390228

Paul Cercueil authored Jul 16, 2020

Convert the ingenic,lcd.txt to a new ingenic,lcd.yaml file.

In the process, the new ingenic,jz4780-lcd compatible string has been
added.

v2: Add info about IPU at port@8
v3: No change
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-2-paul@crapouillou.net

c9390228

drm/ingenic: Fix incorrect assumption about plane->index · ca43f274

Paul Cercueil authored Jul 16, 2020

plane->index is NOT the index of the color plane in a YUV frame.
Actually, a YUV frame is represented by a single drm_plane, even though
it contains three Y, U, V planes.

v2-v3: No change

Cc: stable@vger.kernel.org # v5.3
Fixes: 90b86fcc ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs")
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Reviewed-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716163846.174790-1-paul@crapouillou.net

ca43f274

drm/nouveau/kms/nvd9-: Fix disabling CRCs alongside OR reprogramming · 2d786508

Lyude Paul authored Jun 29, 2020

While I had thought I'd tested this before, it looks like this one issue
slipped by my original CRC patches. Basically, there seem to be a few
rules we need to follow when sending CRC commands to the display
controller:

* CRCs cannot be both disabled and enabled for a single head in the same
  flush
* If a head with CRC reporting enabled switches from one OR to another,
  there must be a flush before the OR is re-enabled regardless of the
  final state of CRC reporting.

So, split nv50_crc_atomic_prepare_notifier_contexts() into two
functions:
* nv_crc_atomic_release_notifier_contexts() - checks whether the CRC
  notifier contexts were released successfully after the first flush
* nv_crc_atomic_init_notifier_contexts() - prepares any CRC notifier
  contexts for use before enabling reporting

Additionally, in order to force a flush when we re-assign ORs with heads
that have CRCs enabled we split our atomic check function into two:

* nv50_crc_atomic_check_head() - called from our heads' atomic checks,
  determines whether a state needs to set or clear CRC reporting
* nv50_crc_atomic_check_outp() - called at the end of the atomic check
  after all ORs have been added to the atomic state, and sets
  nv50_atom->flush_disable if needed
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <skeggsb@gmail.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200629223635.103804-1-lyude@redhat.com

2d786508

drm/nouveau/kms/nvd9-: Add CRC support · 12885ecb

Lyude Paul authored Oct 07, 2019

This introduces support for CRC readback on gf119+, using the
documentation generously provided to us by Nvidia:

https://github.com/NVIDIA/open-gpu-doc/blob/master/Display-CRC/display-crc.txt

We expose all available CRC sources. SF, SOR, PIOR, and DAC are exposed
through a single set of "outp" sources: outp-active/auto for a CRC of
the scanout region, outp-complete for a CRC of both the scanout and
blanking/sync region combined, and outp-inactive for a CRC of only the
blanking/sync region. For each source, nouveau selects the appropriate
tap point based on the output path in use. We also expose an "rg"
source, which allows for capturing CRCs of the scanout raster before
it's encoded into a video signal in the output path. This tap point is
referred to as the raster generator.

Note that while there's some other neat features that can be used with
CRC capture on nvidia hardware, like capturing from two CRC sources
simultaneously, I couldn't see any usecase for them and did not
implement them.

Nvidia only allows for accessing CRCs through a shared DMA region that
we program through the core EVO/NvDisplay channel which is referred to
as the notifier context. The notifier context is limited to either 255
(for Fermi-Pascal) or 2047 (Volta+) entries to store CRCs in, and
unfortunately the hardware simply drops CRCs and reports an overflow
once all available entries in the notifier context are filled.

Since the DRM CRC API and igt-gpu-tools don't expect there to be a limit
on how many CRCs can be captured, we work around this in nouveau by
allocating two separate notifier contexts for each head instead of one.
We schedule a vblank worker ahead of time so that once we start getting
close to filling up all of the available entries in the notifier
context, we can swap the currently used notifier context out with
another pre-prepared notifier context in a manner similar to page
flipping.

Unfortunately, the hardware only allows us to this by flushing two
separate updates on the core channel: one to release the current
notifier context handle, and one to program the next notifier context's
handle. When the hardware processes the first update, the CRC for the
current frame is lost. However, the second update can be flushed
immediately without waiting for the first to complete so that CRC
generation resumes on the next frame. According to Nvidia's hardware
engineers, there isn't any cleaner way of flipping notifier contexts
that would avoid this.

Since using vblank workers to swap out the notifier context will ensure
we can usually flush both updates to hardware within the timespan of a
single frame, we can also ensure that there will only be exactly one
frame lost between the first and second update being executed by the
hardware. This gives us the guarantee that we're always correctly
matching each CRC entry with it's respective frame even after a context
flip. And since IGT will retrieve the CRC entry for a frame by waiting
until it receives a CRC for any subsequent frames, this doesn't cause an
issue with any tests and is much simpler than trying to change the
current DRM API to accommodate.

In order to facilitate testing of correct handling of this limitation,
we also expose a debugfs interface to manually control the threshold for
when we start trying to flip the notifier context. We will use this in
igt to trigger a context flip for testing purposes without needing to
wait for the notifier to completely fill up. This threshold is reset
to the default value set by nouveau after each capture, and is exposed
in a separate folder within each CRTC's debugfs directory labelled
"nv_crc".

Changes since v1:
* Forgot to finish saving crc.h before saving, whoops. This just adds
  some corrections to the empty function declarations that we use if
  CONFIG_DEBUG_FS isn't enabled.
Changes since v2:
* Don't check return code from debugfs_create_dir() or
  debugfs_create_file() - Greg K-H
Changes since v3:
  (no functional changes)
* Fix SPDX license identifiers (checkpatch)
* s/uint32_t/u32/ (checkpatch)
* Fix indenting in switch cases (checkpatch)
Changes since v4:
* Remove unneeded param changes with nv50_head_flush_clr/set
* Rebase
Changes since v5:
* Remove set but unused variable (outp) in nv50_crc_atomic_check() -
  Kbuild bot
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-10-lyude@redhat.com

12885ecb

drm/nouveau/kms/nv50-: Move hard-coded object handles into header · 0bc8ffe0

Lyude Paul authored Jan 21, 2020

While most of the functionality on Nvidia GPUs doesn't require using an
explicit handle instead of the main VRAM handle + offset, there are a
couple of places that do require explicit handles, such as CRC
functionality. Since this means we're about to add another
nouveau-chosen handle, let's just go ahead and move any hard-coded
handles into a single header. This is just to keep things slightly
organized, and to make it a little bit easier if we need to add more
handles in the future.

This patch should contain no functional changes.

Changes since v3:
* Correct SPDX license identifier (checkpatch)
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-9-lyude@redhat.com

0bc8ffe0

drm/nouveau/kms/nv50-: Expose nv50_outp_atom in disp.h · ebec8847

Lyude Paul authored Feb 07, 2020

In order to make sure that we flush disable updates at the right time
when disabling CRCs, we'll need to be able to look at the outp state to
see if we're changing it at the same time that we're disabling CRCs.

So, expose the struct in disp.h.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-8-lyude@redhat.com

ebec8847

drm/nouveau/kms/nv140-: Track wndw mappings in nv50_head_atom · dbdaf719

Lyude Paul authored Feb 06, 2020

While we're not quite ready yet to add support for flexible wndw
mappings, we are going to need to at least keep track of the static wndw
mappings we're currently using in each head's atomic state. We'll likely
use this in the future to implement real flexible window mapping, but
the primary reason we'll need this is for CRC support.

See: on nvidia hardware, each CRC entry in the CRC notifier dma context
has a "tag". This tag corresponds to the nth update on a specific
EVO/NvDisplay channel, which itself is referred to as the "controlling
channel". For gf119+ this can be the core channel, ovly channel, or base
channel. Since we don't expose CRC entry tags to userspace, we simply
ignore this feature and always use the core channel as the controlling
channel. Simple.

Things get a little bit more complicated on gv100+ though. GV100+ only
lets us set the controlling channel to a specific wndw channel, and that
wndw must be owned by the head that we're grabbing CRCs when we enable
CRC generation. Thus, we always need to make sure that each atomic head
state has at least one wndw that is mapped to the head, which will be
used as the controlling channel.

Note that since we don't have flexible wndw mappings yet, we don't
expect to run into any scenarios yet where we'd have a head with no
mapped wndws. When we do add support for flexible wndw mappings however,
we'll need to make sure that we handle reprogramming CRC capture if our
controlling wndw is moved to another head (and potentially reject the
new head state entirely if we can't find another available wndw to
replace it).

With that being said, nouveau currently tracks wndw visibility on heads.
It does not keep track of the actual ownership mappings, which are
(currently) statically programmed. To fix this, we introduce another
bitmask into nv50_head_atom.wndw to keep track of ownership separately
from visibility. We then introduce a nv50_head callback to handle
populating the wndw ownership map, and call it during the atomic check
phase when core->assign_windows is set to true.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-7-lyude@redhat.com

dbdaf719

drm/nouveau/kms/nv50-: Fix disabling dithering · fb2420b7

Lyude Paul authored Mar 17, 2020

While we expose the ability to turn off hardware dithering for nouveau,
we actually make the mistake of turning it on anyway, due to
dithering_depth containing a non-zero value if our dithering depth isn't
also set to 6 bpc.

So, fix it by never enabling dithering when it's disabled.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-6-lyude@redhat.com

fb2420b7

drm/nouveau/kms/nv140-: Don't modify depth in state during atomic commit · 9c8e9b79

Lyude Paul authored Dec 13, 2019

Currently, we modify the depth value stored in the atomic state when
performing a commit in order to workaround the fact we haven't
implemented support for depths higher then 10 yet. This isn't idempotent
though, as it will happen every atomic commit where we modify the OR
state even if the head's depth in the atomic state hasn't been modified.

Normally this wouldn't matter, since we don't modify OR state outside of
modesets, but since the CRC capture region is implemented as part of the
OR state in hardware we'll want to make sure all commits modifying OR
state are idempotent so as to avoid changing the depth unexpectedly.

So, fix this by simply not writing the reduced depth value we come up
with to the atomic state.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Acked-by: Dave Airlie <airlied@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200627194657.156514-5-lyude@redhat.com

9c8e9b79