Commits · be5df20a34d78d986ddfb6e0fc87e4fa4d05268b · nexedi / linux

22 Mar, 2017 1 commit

Merge tag 'drm-intel-next-2017-03-20' of git://anongit.freedesktop.org/git/drm-intel into drm-next · be5df20a

Dave Airlie authored Mar 23, 2017

More in i915 for 4.12:

- designware i2c fixes from Hans de Goede, in a topic branch shared
  with other subsystems (maybe, they didn't confirm, but requested the
  pull)
- drop drm_panel usage from the intel dsi vbt panel (Jani)
- vblank evasion improvements and tracing (Maarten and Ville)
- clarify spinlock irq semantics again a bit (Tvrtko)
- new ->pwrite backend hook (right now just for shmem pageche writes),
  from Chris
- more planar/ccs work from Ville
- hotplug safe connector iterators everywhere
- userptr fixes (Chris)
- selftests for cache coloring eviction (Matthew Auld)
- extend debugfs drop_caches interface for shrinker testing (Chris)
- baytrail "the rps kills the machine" fix (Chris)
- use new atomic state iterators, a lot (Maarten)
- refactor guc/huc code some (Arkadiusz Hiler)
- tighten breadcrumbs rbtree a bit (Chris)
- improve wrap-around and time handling in rps residency counters
  (Mika)
- split reset-in-progress in two flags, backoff and handoff (Chris)
- other misc reset improvements from a few people
- bunch of vgpu interaction fixes with recent code changes
- misc stuff all over, as usual

* tag 'drm-intel-next-2017-03-20' of git://anongit.freedesktop.org/git/drm-intel: (144 commits)
  drm/i915: Update DRIVER_DATE to 20170320
  drm/i915: Initialise i915_gem_object_create_from_data() directly
  drm/i915: Correct error handling for i915_gem_object_create_from_data()
  drm/i915: i915_gem_object_create_from_data() doesn't require struct_mutex
  drm/i915: Retire an active batch pool object rather than allocate new
  drm/i915: Add i810/i815 pci-ids for completeness
  drm/i915: Skip execlists_dequeue() early if the list is empty
  drm/i915: Stop using obj->obj_exec_link outside of execbuf
  drm/i915: Squelch WARN for VLV_COUNTER_CONTROL
  drm/i915/glk: Enable pooled EUs for Geminilake
  drm/i915: Remove superfluous i915_add_request_no_flush() helper
  drm/i915/vgpu: Neuter forcewakes for VGPU more thoroughly
  drm/i915: Fix vGPU balloon for ggtt guard page
  drm/i915: Avoid use-after-free of ctx in request tracepoints
  drm/i915: Assert that the context pin_counts do not overflow
  drm/i915: Wait for reset to complete before returning from debugfs/i915_wedged
  drm/i915: Restore engine->submit_request before unwedging
  drm/i915: Move engine->submit_request selection to a vfunc
  drm/i915: Split I915_RESET_IN_PROGRESS into two flags
  drm/i915: make context status notifier head be per engine
  ...

be5df20a

20 Mar, 2017 2 commits

drm/i915: Update DRIVER_DATE to 20170320 · c5bd2e14
Daniel Vetter authored Mar 20, 2017
```
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
```
c5bd2e14

Merge tag 'imx-drm-next-2017-03-17' of git://git.pengutronix.de/git/pza/linux into drm-next · 33d5f513

Dave Airlie authored Mar 20, 2017

imx-drm PRE/PRG support, deferred plane disabling, separate alpha support

- Initial support for the Prefetch Resolve Engine/Gasket on i.MX6QP,
  improving linear scanout buffer memory bandwidth utilization. This
  will in the future grow reordering support and allow direct scanout
  of Vivante tiled renderbuffers from the GPU.
- Deferred plane disabling gets rid of some busy waiting in the atomic
  plane disable and crtc disable paths that lead to wait_for_vblank
  timeouts.
- Add support for RGBA formats with a separate alpha plane, that can
  reduce memory bandwidth utilization for mostly transparent overlay
  planes by skipping color reads for completely transparent regions.
- Allow moving an active overlay plane without enforcing a modeset.
- Add 8-bit and 16-bit bayer formats to ipu_cpmem_set_image.
- Set the base address in ipu_cpmem_set_image even for invalid formats
  to increase robustness against errors.
- Use drm_plane_helper_check_state in plane atomic_check.
- Some cleanup.

* tag 'imx-drm-next-2017-03-17' of git://git.pengutronix.de/git/pza/linux: (22 commits)
  drm/imx: Remove unneeded definition for structure imx_drm_component
  drm/imx: use PRG/PRE when possible
  drm/imx: enable/disable PRG on CRTC enable/disable
  gpu: ipu-v3: only set non-zero AXI ID for IC when PRG is absent
  gpu: ipu-v3: hook up PRG unit
  gpu: ipu-v3: document valid IPUv3 compatibles and extend for i.MX6 QuadPlus
  gpu: ipu-v3: add driver for Prefetch Resolve Gasket
  gpu: ipu-v3: add DT binding for the Prefetch Resolve Gasket
  gpu: ipu-v3: add driver for Prefetch Resolve Engine
  gpu: ipu-v3: add DT binding for the Prefetch Resolve Engine
  drm/imx: ipuv3-plane: add support for separate alpha planes
  drm/imx: extend drm_plane_state_to_eba for separate channel support
  gpu: ipu-v3: add support for separate alpha channels
  drm: add RGB formats with separate alpha plane
  drm/imx: add deferred plane disabling
  drm/imx: don't wait for vblank and stop calling cleanup_planes in commit_tail
  gpu: ipu-v3: add unsynchronised DP channel disabling
  gpu: ipu-v3: remove IRQ dance on DC channel disable
  gpu: ipu-cpmem: add bayer formats to ipu_cpmem_set_image
  gpu: ipu-cpmem: set image base address even for incorrect formats
  ...

33d5f513

17 Mar, 2017 15 commits

drm/i915: Initialise i915_gem_object_create_from_data() directly · be062fa4

Chris Wilson authored Mar 17, 2017

Use pagecache_write to avoid shmemfs clearing the pages prior to us
immediately overwriting them with our data.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170317194648.12468-2-chris@chris-wilson.co.ukReviewed-by: Matthew Auld <matthew.auld@intel.com>

be062fa4

drm/i915: Correct error handling for i915_gem_object_create_from_data() · f3ddd2c1

Chris Wilson authored Mar 17, 2017

i915_gem_object_create_from_data() always returns an error pointer on
failure, there is no need to check against NULL.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170317205317.7885-1-chris@chris-wilson.co.ukReviewed-by: Matthew Auld <matthew.auld@intel.com>

f3ddd2c1

drm/i915: i915_gem_object_create_from_data() doesn't require struct_mutex · ce8ff099

Chris Wilson authored Mar 17, 2017

Both object creation and backing storage page allocation do not require
struct_mutex, so do not require the caller to take it.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170317194648.12468-1-chris@chris-wilson.co.ukReviewed-by: Matthew Auld <matthew.auld@intel.com>

ce8ff099

drm/i915: Retire an active batch pool object rather than allocate new · 51a575d9

Chris Wilson authored Mar 16, 2017

Since obj->active_count is only updated upon retirement, if we see an
active object in the batch pool, double check that is still active
before deciding to allocate a new object.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316132006.7976-3-chris@chris-wilson.co.ukReviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

51a575d9

drm/i915: Add i810/i815 pci-ids for completeness · 92a0256e

Chris Wilson authored Mar 13, 2017

To improve our historical record and to simplify userspace that wants to
include i915_pciids.h as its canonical breakdown of PCI IDs and their
respective generations, include the gen1 ids for i810 and i815.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170313112810.4202-1-chris@chris-wilson.co.ukAcked-by: Jani Nikula <jani.nikula@intel.com>

92a0256e

drm/i915: Skip execlists_dequeue() early if the list is empty · 6c943de6

Chris Wilson authored Mar 17, 2017

Do an early read of the execlists' queue before we take the spinlock and
start checking. This is safe as the first writer to the execlists queue
will cause the tasklet to be run again after a memory barrier.

v2: Keep guc in sync with execlists queue changes
v3: Explain the mb between the tasklet running on one cpu and the
execlist_first update and schedule from a second cpu.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170317120716.17191-1-chris@chris-wilson.co.uk

6c943de6

drm/i915: Stop using obj->obj_exec_link outside of execbuf · e637d2cb

Chris Wilson authored Mar 16, 2017

i915_gem_stolen_list_info() sneakily takes advantage of the
obj->obj_exec_link to save itself from having to allocate. Enough of the
subterfuge, just allocate an array of pointers and sort them instead of
the list.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316132006.7976-7-chris@chris-wilson.co.uk

e637d2cb

drm/i915: Squelch WARN for VLV_COUNTER_CONTROL · facbecad

Chris Wilson authored Mar 17, 2017

Before rc6 is initialised (after driver load or resume), the value inside
VLV_COUNTER_CONTROL is undefined so we cannot make an assertion that is
in HIGH_RANGE mode.

Fixes: 6b7f6aa7 ("drm/i915: Use coarse grained residency counter with byt")
Testcase: igt/drv_suspend/debugfs-reader
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170317125918.11351-1-chris@chris-wilson.co.ukReviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

facbecad

drm/i915/glk: Enable pooled EUs for Geminilake · 234516af

Ander Conselvan de Oliveira authored Mar 17, 2017

Geminilake also supports pooled EUs. Enable it.

It is unclear if the recommendation to disable it for 2x6 configurations
from commit e015dd69 ("drm/i915/bxt: Add WaEnablePooledEuFor2x6")
should also apply to GLK, but it is applied anyway to be on the safe
side. That restriction can be lifted later if determined not to impact
performance.

The extra restriction should not impact user space either. The only user
space that uses this feature is Beignet, and it only does so for 3x6
devices. See See Beignet's commit 6901899ec90a ("Runtime: set the sub
slice according to kernel pooled EU configure.").

v2: Improve commit message. (Mika, Roy)

Cc: Arun Siluvery <arun.siluvery@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Yang Rong <rong.r.yang@intel.com>
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170317140436.24645-1-ander.conselvan.de.oliveira@intel.com

234516af

drm/i915: Remove superfluous i915_add_request_no_flush() helper · e642c85b

Chris Wilson authored Mar 17, 2017

The only time we need to emit a flush inside request emission is after
an execbuffer, for which we can use the full __i915_add_request(). All
other instances want the simpler i915_add_request() without flushing, so
remove the useless helper.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170317114709.8388-1-chris@chris-wilson.co.uk

e642c85b

drm/i915/vgpu: Neuter forcewakes for VGPU more thoroughly · e3b1895f

Tvrtko Ursulin authored Mar 10, 2017

If we avoid initializing forcewake domains when running as
a guest, and also use gen2 mmio accessors in that case, we
can avoid the timer traffic and any looping through the
forcewake code which is currently just so it can end up in
the no-op forcewake implementation.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Weinan Li <weinan.z.li@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Terrence Xu <terrence.xu@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170310095747.12258-1-tvrtko.ursulin@linux.intel.com
[tursulin: commit spelling fix]

e3b1895f

drm/i915: Fix vGPU balloon for ggtt guard page · fa7e8b55

Zhenyu Wang authored Mar 10, 2017

From commit a6508ded ("drm/i915: Use page coloring to provide the guard
page at the end of the GTT"), we no longer explicitly subtract guard page
at end for GGTT address space init, so shouldn't subtract that for vGPU
balloon too, as that will leave that end page to be available for
vGPU. Change balloon to cover full range too.

This fixes to use recent drm-intel tip kernel for guest OS. Found by GVT-g
cmd parser that guest kernel uses end page as scratch then try to run
MI_STORE_REG_MEM onto it.

v2: remove old comments

Cc: Terrence Xu <terrence.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170310022238.3191-1-zhenyuw@linux.intel.comReviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

fa7e8b55

drm/i915: Avoid use-after-free of ctx in request tracepoints · 60367132

Chris Wilson authored Mar 16, 2017

trace_i915_gem_request_out may be used after the request is completed,
and so the request may have been retired on another thread, invalidating
the rq->ctx. Avoid dereferencing rq->ctx in the tracepoint by switching
to the fence context id instead, updating all tracepoints to match.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316204235.27786-1-chris@chris-wilson.co.ukReviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

60367132

drm/nouveau/secboot: fix NULL pointer dereference · b7d6c8db

Alexandre Courbot authored Mar 10, 2017

The msgqueue pointer validity should be checked by its owner, not by the
msgqueue code itself to avoid this situation.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reported-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Dave Airlie <airlied@redhat.com>

b7d6c8db

drm/nouveau/secboot: fix inconsistent pointer checking · aa7fc0ca

Alexandre Courbot authored Mar 15, 2017

We were returning PTR_ERR() on a NULL pointer, which obviously won't
work. nvkm_engine_ref() will return an error in case something went
wrong.
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

aa7fc0ca

16 Mar, 2017 22 commits

drm/i915: Assert that the context pin_counts do not overflow · a533b4ba

Chris Wilson authored Mar 16, 2017

This should be impossible, but let's assert that we do not pin a context
4 billion times before retiring!

v2: Fix the assertion -- the patch had just one job to do!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171628.3228-1-chris@chris-wilson.co.ukReviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

a533b4ba

drm/i915: Wait for reset to complete before returning from debugfs/i915_wedged · d3df42b7

Chris Wilson authored Mar 16, 2017

Provide some serialisation between user operations by waiting for the
reset initiated by setting i915_wedged to complete.

The automatic wait here makes
        echo 1 > i915_wedged; cat i915_error_state
do the right thing, and not risk reporting "No error collected".
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-4-chris@chris-wilson.co.uk

d3df42b7

drm/i915: Restore engine->submit_request before unwedging · 2e8f9d32

Chris Wilson authored Mar 16, 2017

When we wedge the device, we override engine->submit_request with a nop
to ensure that all in-flight requests are marked in error. However, igt
would like to unwedge the device to test -EIO handling. This requires us
to flush those in-flight requests and restore the original
engine->submit_request.

v2: Use a vfunc to unify enabling request submission to engines
v3: Split new vfunc to a separate patch.
v4: Make the wait interruptible -- the third party fences we wait upon
may be indefinitely broken, so allow the reset to be aborted.

Fixes: 821ed7df ("drm/i915: Update reset path to fix incomplete requests")
Testcase: igt/gem_eio
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v3
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-3-chris@chris-wilson.co.uk

2e8f9d32

drm/i915: Move engine->submit_request selection to a vfunc · ff44ad51

Chris Wilson authored Mar 16, 2017

It turns out that we may want to restore the original
engine->submit_request (and engine->schedule) callbacks from more than
just the guc <-> execlists transition. Move this to a vfunc so we can
have a common interface.

v2: Move initial selection to intel_engines_init_common(), repaint vfunc
with engine->set_default_submission (and a similar colour for the
helper).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-2-chris@chris-wilson.co.uk

ff44ad51

drm/i915: Split I915_RESET_IN_PROGRESS into two flags · 8c185eca

Chris Wilson authored Mar 16, 2017

I915_RESET_IN_PROGRESS is being used for both signaling the requirement
to i915_mutex_lock_interruptible() to avoid taking the struct_mutex and
to instruct a waiter (already holding the struct_mutex) to perform the
reset. To allow for a little more coordination, split these two meaning
into a couple of distinct flags. I915_RESET_BACKOFF tells
i915_mutex_lock_interruptible() not to acquire the mutex and
I915_RESET_HANDOFF tells the waiter to call i915_reset().
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316171305.12972-1-chris@chris-wilson.co.uk

8c185eca

drm/i915: make context status notifier head be per engine · 3fc03069

Changbin Du authored Mar 13, 2017

GVTg has introduced the context status notifier to schedule the GVTg
workload. At that time, the notifier is bound to GVTg context only,
so GVTg is not aware of host workloads.

Now we are going to improve GVTg's guest workload scheduler policy,
and add Guc emulation support for new Gen graphics. Both these two
features require acknowledgment for all contexts running on hardware.
(But will not alter host workload.) So here try to make some change.

The change is simple:
  1. Move the context status notifier head from i915_gem_context to
     intel_engine_cs. Which means there is a notifier head per engine
     instead of per context. Execlist driver still call notifier for
     each context sched-in/out events of current engine.
  2. At GVTg side, it binds a notifier_block for each physical engine
     at GVTg initialization period. Then GVTg can hear all context
     status events.

In this patch, GVTg do nothing for host context event, but later
will add a function there. But in any case, the notifier callback is
a noop if this is no active vGPU.

Since intel_gvt_init() is called at early initialization stage and
require the status notifier head has been initiated, I initiate it in
intel_engine_setup().

v2: remove a redundant newline. (chris)

Fixes: 3c7ba635 ("drm/i915: Introduce execlist context status change notification")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100232Signed-off-by: Changbin Du <changbin.du@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170313024711.28591-1-changbin.du@intel.comAcked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

3fc03069

drm/i915/scheduler: emulate a scheduler for guc · 31de7350

Chris Wilson authored Mar 16, 2017

This emulates execlists on top of the GuC in order to defer submission of
requests to the hardware. This deferral allows time for high priority
requests to gazump their way to the head of the queue, however it nerfs
the GuC by converting it back into a simple execlist (where the CPU has
to wake up after every request to feed new commands into the GuC).

v2: Drop hack status - though iirc there is still a lockdep inversion
between fence and engine->timeline->lock (which is impossible as the
nesting only occurs on different fences - hopefully just requires some
judicious lockdep annotation)
v3: Apply lockdep nesting to enabling signaling on the request, using
the pattern we already have in __i915_gem_request_submit();
v4: Replaying requests after a hang also now needs the timeline
spinlock, to disable the interrupts at least
v5: Hold wq lock for completeness, and emit a tracepoint for enabling signal
v6: Reorder interrupt checking for a happier gcc.
v7: Only signal the tasklet after a user-interrupt if using guc scheduling
v8: Restore lost update of rq through the i915_guc_irq_handler (Tvrtko)
v9: Avoid re-initialising the engine->irq_tasklet from inside a reset
v10: Hook up the execlists-style tracepoints
v11: Clear the execlists irq_posted bit after taking over the interrupt/tasklet
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170316125619.6856-1-chris@chris-wilson.co.ukReviewed-by: Michał Winiarski <michal.winiarski@intel.com>

31de7350

drm/i915: Replace irq_seqno_barrier on hws write with a clflush · 14a6bbf9

Chris Wilson authored Mar 14, 2017

When manually overwriting the HWS, rather than assume irq_seqno_barrier
does the right thing, we can explicitly flush the cacheline instead.
This avoids us calling the engine->irq_seqno_barrier() from an illegal
context:

[ 1472.651797] BUG: scheduling while atomic: migration/0/11/0x00000002
[ 1472.651807] Modules linked in: ctr ccm arc4 snd_hda_codec_hdmi bnep rfcomm iwldvm snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel mac80211 snd_hda_codec snd_hda_core snd_pcm dm_multipath snd_hwdep intel_powerclamp coretemp snd_seq_midi crct10dif_pclmul snd_seq_midi_event crc32_pclmul iwlwifi ghash_clmulni_intel btusb snd_rawmidi btrtl aesni_intel btbcm aes_x86_64 crypto_simd btintel cryptd glue_helper bluetooth snd_seq cfg80211 snd_timer snd_seq_device intel_ips binfmt_misc snd mei_me soundcore mei dm_mirror dm_region_hash dm_log i915 intel_gtt i2c_algo_bit drm_kms_helper cfbfillrect syscopyarea cfbimgblt sysfillrect sysimgblt fb_sys_fops cfbcopyarea prime_numbers e1000e drm ahci libahci
[ 1472.651897] CPU: 0 PID: 11 Comm: migration/0 Tainted: G     U          4.11.0-rc1+ #203
[ 1472.651899] Hardware name: LENOVO 514328U/514328U, BIOS 6QET44WW (1.14 ) 04/20/2010
[ 1472.651900] Call Trace:
[ 1472.651913]  dump_stack+0x63/0x90
[ 1472.651922]  __schedule_bug+0x5d/0x6b
[ 1472.651930]  __schedule+0x46a/0x5f0
[ 1472.651934]  schedule+0x38/0x90
[ 1472.651938]  schedule_hrtimeout_range_clock+0x85/0x110
[ 1472.651945]  ? hrtimer_init+0x10/0x10
[ 1472.651949]  schedule_hrtimeout_range+0xe/0x10
[ 1472.651952]  usleep_range+0x4d/0x60
[ 1472.652037]  gen5_seqno_barrier+0x13/0x20 [i915]
[ 1472.652101]  intel_engine_init_global_seqno+0xd7/0x160 [i915]
[ 1472.652160]  __i915_gem_set_wedged_BKL+0xa0/0x180 [i915]
[ 1472.652166]  multi_cpu_stop+0xbb/0xe0
[ 1472.652170]  ? cpu_stop_queue_work+0x90/0x90
[ 1472.652174]  cpu_stopper_thread+0x82/0x110
[ 1472.652179]  smpboot_thread_fn+0x137/0x190
[ 1472.652184]  kthread+0xf7/0x130
[ 1472.652187]  ? sort_range+0x20/0x20
[ 1472.652191]  ? kthread_park+0x90/0x90
[ 1472.652195]  ret_from_fork+0x2c/0x40

Testcase: igt/gem_eio #ilk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170314111452.9375-1-chris@chris-wilson.co.ukReviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

14a6bbf9

drm/i915: Use coarse grained residency counter with byt · 6b7f6aa7

Mika Kuoppala authored Mar 15, 2017

Set byt rc residency counters high level as chv does by
default. We lose some accuracy on byt but we can do the calculation
without extra hw read on both platforms, as now they behave
identically in this respect.

v2: use ktime
v3: keep comparison u32 (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1489592584-10422-1-git-send-email-mika.kuoppala@intel.com

6b7f6aa7

drm/i915: Use ktime to calculate rc0 residency · 679cb6c1

Mika Kuoppala authored Mar 15, 2017

We have used cz timestamp register to gain a reference time wrt
to residency calculations. The residency counts are in cz clk ticks
(333Mhz clock) but for some reason the cz timestamp register gives
100us units. Perhaps for some other usage, the base-ten based values
are easier, but in residency calculations raw units would have been
the easiest.

As there is not much advantage of using base-ten clock through
a more costly punit access, take our reference times directly from
kernel clock.

v2: use ktime (Chris, Ville)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

679cb6c1

drm/i915: Convert debugfs to use generic residency calculator · 1362877e

Mika Kuoppala authored Mar 15, 2017

Use intel_rc6_residency to get benefit for increased resolution
in byt/chv.

v2: output raw and time (Chris)
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

1362877e

drm/i915: Extend vlv/chv residency resolution · 47c21d9a

Mika Kuoppala authored Mar 15, 2017

Vlv and chv residency counters are 40 bits in width.
With a control bit, we can choose between upper or lower
32 bit window into this counter.

Lets toggle this bit on and off on and read both parts.
As a result we can push the wrap from 13 seconds to 54
minutes.

v2: commit msg, loop readability, goto elimination (Chris)
v3: bug ref, divide outside runtime pm lock (Chris)

References: https://bugs.freedesktop.org/show_bug.cgi?id=94852Reported-by: Len Brown <len.brown@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

47c21d9a

drm/i915: Return residency as microseconds · c5a0ad11

Mika Kuoppala authored Mar 15, 2017

Change the granularity from milliseconds to microseconds
when returning rc6 residencies. This is in preparation
for increased resolution on some platforms.

v2: use 64bit div macro (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

c5a0ad11

drm/i915: Move residency calculation into intel_pm.c · 135bafa5

Mika Kuoppala authored Mar 15, 2017

Plan is to make generic residency calculation utility
function for usage outside of sysfs. As a first step
move residency calculation into intel_pm.c
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>

135bafa5

drm/i915/userptr: Reinvent GGTT self-faulting protection · 15c344f4

Chris Wilson authored Mar 15, 2017

lockdep doesn't like us taking the mm->mmap_sem inside the get_pages
callback for a couple of reasons. The straightforward deadlock:

[13755.434059] =============================================
[13755.434061] [ INFO: possible recursive locking detected ]
[13755.434064] 4.11.0-rc1-CI-CI_DRM_297+ #1 Tainted: G     U
[13755.434066] ---------------------------------------------
[13755.434068] gem_userptr_bli/8398 is trying to acquire lock:
[13755.434070]  (&mm->mmap_sem){++++++}, at: [<ffffffffa00c988a>] i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434096]
               but task is already holding lock:
[13755.434098]  (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
[13755.434105]
               other info that might help us debug this:
[13755.434108]  Possible unsafe locking scenario:

[13755.434110]        CPU0
[13755.434111]        ----
[13755.434112]   lock(&mm->mmap_sem);
[13755.434115]   lock(&mm->mmap_sem);
[13755.434117]
                *** DEADLOCK ***

[13755.434121]  May be due to missing lock nesting notation

[13755.434126] 2 locks held by gem_userptr_bli/8398:
[13755.434128]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8104d485>] __do_page_fault+0x105/0x560
[13755.434135]  #1:  (&obj->mm.lock){+.+.+.}, at: [<ffffffffa00b887d>] __i915_gem_object_get_pages+0x1d/0x70 [i915]
[13755.434156]
               stack backtrace:
[13755.434161] CPU: 3 PID: 8398 Comm: gem_userptr_bli Tainted: G     U          4.11.0-rc1-CI-CI_DRM_297+ #1
[13755.434165] Hardware name: GIGABYTE GB-BKi7(H)A-7500/MFLP7AP-00, BIOS F4 02/20/2017
[13755.434169] Call Trace:
[13755.434174]  dump_stack+0x67/0x92
[13755.434178]  __lock_acquire+0x133a/0x1b50
[13755.434182]  lock_acquire+0xc9/0x220
[13755.434200]  ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434204]  down_read+0x42/0x70
[13755.434221]  ? i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434238]  i915_gem_userptr_get_pages+0x5a/0x2e0 [i915]
[13755.434255]  ____i915_gem_object_get_pages+0x25/0x60 [i915]
[13755.434272]  __i915_gem_object_get_pages+0x59/0x70 [i915]
[13755.434288]  i915_gem_fault+0x397/0x6a0 [i915]
[13755.434304]  ? i915_gem_fault+0x1a1/0x6a0 [i915]
[13755.434308]  ? __lock_acquire+0x449/0x1b50
[13755.434311]  ? __lock_acquire+0x449/0x1b50
[13755.434315]  ? vm_mmap_pgoff+0xa9/0xd0
[13755.434318]  __do_fault+0x19/0x70
[13755.434321]  __handle_mm_fault+0x863/0xe50
[13755.434325]  handle_mm_fault+0x17f/0x370
[13755.434329]  ? handle_mm_fault+0x40/0x370
[13755.434332]  __do_page_fault+0x279/0x560
[13755.434336]  do_page_fault+0xc/0x10
[13755.434339]  page_fault+0x22/0x30
[13755.434342] RIP: 0033:0x7f5ab91b5880
[13755.434345] RSP: 002b:00007fff62922218 EFLAGS: 00010216
[13755.434348] RAX: 0000000000b74500 RBX: 00007f5ab7f81000 RCX: 0000000000000000
[13755.434352] RDX: 0000000000100000 RSI: 00007f5ab7f81000 RDI: 00007f5aba61c000
[13755.434355] RBP: 00007f5aba61c000 R08: 0000000000000007 R09: 0000000100000000
[13755.434359] R10: 000000000000037d R11: 00007f5ab91b5840 R12: 0000000000000001
[13755.434362] R13: 0000000000000005 R14: 0000000000000001 R15: 0000000000000000

and cyclic deadlocks:

[ 2566.458979] ======================================================
[ 2566.459054] [ INFO: possible circular locking dependency detected ]
[ 2566.459127] 4.11.0-rc1+ #26 Not tainted
[ 2566.459194] -------------------------------------------------------
[ 2566.459266] gem_streaming_w/759 is trying to acquire lock:
[ 2566.459334]  (&obj->mm.lock){+.+.+.}, at: [<ffffffffa034bc80>] i915_gem_object_pin_pages+0x0/0xc0 [i915]
[ 2566.459605]
[ 2566.459605] but task is already holding lock:
[ 2566.459699]  (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
[ 2566.459814]
[ 2566.459814] which lock already depends on the new lock.
[ 2566.459814]
[ 2566.459934]
[ 2566.459934] the existing dependency chain (in reverse order) is:
[ 2566.460030]
[ 2566.460030] -> #1 (&mm->mmap_sem){++++++}:
[ 2566.460139]        lock_acquire+0xfe/0x220
[ 2566.460214]        down_read+0x4e/0x90
[ 2566.460444]        i915_gem_userptr_get_pages+0x6e/0x340 [i915]
[ 2566.460669]        ____i915_gem_object_get_pages+0x8b/0xd0 [i915]
[ 2566.460900]        __i915_gem_object_get_pages+0x6a/0x80 [i915]
[ 2566.461132]        __i915_vma_do_pin+0x7fa/0x930 [i915]
[ 2566.461352]        eb_add_vma+0x67b/0x830 [i915]
[ 2566.461572]        eb_lookup_vmas+0xafe/0x1010 [i915]
[ 2566.461792]        i915_gem_do_execbuffer+0x715/0x2870 [i915]
[ 2566.462012]        i915_gem_execbuffer2+0x106/0x2b0 [i915]
[ 2566.462152]        drm_ioctl+0x36c/0x670 [drm]
[ 2566.462236]        do_vfs_ioctl+0x12c/0xa60
[ 2566.462317]        SyS_ioctl+0x41/0x70
[ 2566.462399]        entry_SYSCALL_64_fastpath+0x1c/0xb1
[ 2566.462477]
[ 2566.462477] -> #0 (&obj->mm.lock){+.+.+.}:
[ 2566.462587]        __lock_acquire+0x1602/0x1790
[ 2566.462661]        lock_acquire+0xfe/0x220
[ 2566.462893]        i915_gem_object_pin_pages+0x4c/0xc0 [i915]
[ 2566.463116]        i915_gem_fault+0x2c2/0x8c0 [i915]
[ 2566.463197]        __do_fault+0x42/0x130
[ 2566.463276]        __handle_mm_fault+0x92c/0x1280
[ 2566.463356]        handle_mm_fault+0x1e2/0x440
[ 2566.463443]        __do_page_fault+0x1c4/0x500
[ 2566.463529]        do_page_fault+0xc/0x10
[ 2566.463613]        page_fault+0x1f/0x30
[ 2566.463693]
[ 2566.463693] other info that might help us debug this:
[ 2566.463693]
[ 2566.463820]  Possible unsafe locking scenario:
[ 2566.463820]
[ 2566.463918]        CPU0                    CPU1
[ 2566.463988]        ----                    ----
[ 2566.464068]   lock(&mm->mmap_sem);
[ 2566.464143]                                lock(&obj->mm.lock);
[ 2566.464226]                                lock(&mm->mmap_sem);
[ 2566.464304]   lock(&obj->mm.lock);
[ 2566.464378]
[ 2566.464378]  *** DEADLOCK ***
[ 2566.464378]
[ 2566.464504] 1 lock held by gem_streaming_w/759:
[ 2566.464576]  #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8106fd11>] __do_page_fault+0x121/0x500
[ 2566.464699]
[ 2566.464699] stack backtrace:
[ 2566.464801] CPU: 0 PID: 759 Comm: gem_streaming_w Not tainted 4.11.0-rc1+ #26
[ 2566.464881] Hardware name: GIGABYTE GB-BXBT-1900/MZBAYAB-00, BIOS F8 03/02/2016
[ 2566.464983] Call Trace:
[ 2566.465061]  dump_stack+0x68/0x9f
[ 2566.465144]  print_circular_bug+0x20b/0x260
[ 2566.465234]  __lock_acquire+0x1602/0x1790
[ 2566.465323]  ? debug_check_no_locks_freed+0x1a0/0x1a0
[ 2566.465564]  ? i915_gem_object_wait+0x238/0x650 [i915]
[ 2566.465657]  ? debug_lockdep_rcu_enabled.part.4+0x1a/0x30
[ 2566.465749]  lock_acquire+0xfe/0x220
[ 2566.465985]  ? i915_sg_trim+0x1b0/0x1b0 [i915]
[ 2566.466223]  i915_gem_object_pin_pages+0x4c/0xc0 [i915]
[ 2566.466461]  ? i915_sg_trim+0x1b0/0x1b0 [i915]
[ 2566.466699]  i915_gem_fault+0x2c2/0x8c0 [i915]
[ 2566.466939]  ? i915_gem_pwrite_ioctl+0xce0/0xce0 [i915]
[ 2566.467030]  ? __lock_acquire+0x642/0x1790
[ 2566.467122]  ? __lock_acquire+0x642/0x1790
[ 2566.467209]  ? debug_lockdep_rcu_enabled+0x35/0x40
[ 2566.467299]  ? get_unmapped_area+0x1b4/0x1d0
[ 2566.467387]  __do_fault+0x42/0x130
[ 2566.467474]  __handle_mm_fault+0x92c/0x1280
[ 2566.467564]  ? __pmd_alloc+0x1e0/0x1e0
[ 2566.467651]  ? vm_mmap_pgoff+0x160/0x190
[ 2566.467740]  ? handle_mm_fault+0x111/0x440
[ 2566.467827]  handle_mm_fault+0x1e2/0x440
[ 2566.467914]  ? handle_mm_fault+0x5d/0x440
[ 2566.468002]  __do_page_fault+0x1c4/0x500
[ 2566.468090]  do_page_fault+0xc/0x10
[ 2566.468180]  page_fault+0x1f/0x30
[ 2566.468263] RIP: 0033:0x557895ced32a
[ 2566.468337] RSP: 002b:00007fffd6dd8a10 EFLAGS: 00010202
[ 2566.468419] RAX: 00007f659a4db000 RBX: 0000000000000003 RCX: 00007f659ad032da
[ 2566.468501] RDX: 0000000000000000 RSI: 0000000000100000 RDI: 0000000000000000
[ 2566.468586] RBP: 0000000000000007 R08: 0000000000000003 R09: 0000000100000000
[ 2566.468667] R10: 0000000000000001 R11: 0000000000000246 R12: 0000557895ceda60
[ 2566.468749] R13: 0000000000000001 R14: 00007fffd6dd8ac0 R15: 00007f659a4db000

By checking the status of the gup worker (serialized by the
obj->mm.lock) we can determine whether it is still active, has failed or
has succeeded. If the worker is still active (or failed), we know that
it cannot be bound and so we can skip taking struct_mutex (risking
potential recursion). As we check the worker status, we mark it to
discard any partial results, forcing us to restart on the next
get_pages.
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Fixes: 1c8782dd ("drm/i915/userptr: Disallow wrapping GTT into a userptr")
Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup
Testcase: igt/gem_userptr_blits/dmabuf-sync
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170315140150.19432-1-chris@chris-wilson.co.ukReviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

15c344f4

drm/imx: Remove unneeded definition for structure imx_drm_component · 7d5ed292

Liu Ying authored Mar 15, 2017

No one is using the structure imx_drm_component, so let's remove the
definition to save several lines.
Signed-off-by: Liu Ying <gnuiyl@gmail.com>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

7d5ed292

drm/imx: use PRG/PRE when possible · 00514e85

Lucas Stach authored Mar 08, 2017

Allow the planes to use the PRG/PRE units as linear prefetchers when
possible. This improves DRAM efficiency a bit and reduces the chance
for display underflow when the memory subsystem is under load.

This does not yet support scanning out tiled buffers directly, as this
needs more work, but it already wires up the basic interaction between
imx-drm, the IPUv3 driver and the PRG and PRE drivers.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

00514e85

drm/imx: enable/disable PRG on CRTC enable/disable · e0fb7dd2

Lucas Stach authored Mar 08, 2017

On i.MX6 QuadPlus the PRG needs to be clocked in order to pass
through the data access requests from the IDMAC. This call is a
no-op for other all other SoCs.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

e0fb7dd2

gpu: ipu-v3: only set non-zero AXI ID for IC when PRG is absent · 320a89ad

Lucas Stach authored Mar 08, 2017

Using non-zero AXI IDs for anything other than the display channels
collides with the PRG AXI snooping, so only do this if there is no
PRG present.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

320a89ad

gpu: ipu-v3: hook up PRG unit · 92681fe7

Lucas Stach authored Mar 08, 2017

The i.MX6 QuadPlus IPU needs to PRG unit to gain access to the
data bus. Make sure it is present and available to be used.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

92681fe7

gpu: ipu-v3: document valid IPUv3 compatibles and extend for i.MX6 QuadPlus · 0d6c9a42

Lucas Stach authored Mar 08, 2017

Document the valid compatible strings for the IPUv3.

On i.MX6 QuadPlus the IPU needs to know which PRG has to be
used for this IPU instance. Add a "fsl,prg" property containing
a phandle pointing to the correct PRG device.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>

0d6c9a42

gpu: ipu-v3: add driver for Prefetch Resolve Gasket · ea9c2605

Lucas Stach authored Mar 08, 2017

This adds support for the i.MX6 QUadPlus PRG unit. It glues together the
IPU and the PRE units.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
---
v4: add missing ipu_soc->prg_priv

ea9c2605