An error occurred fetching the project authors.
  1. 28 May, 2019 5 commits
  2. 20 May, 2019 2 commits
  3. 02 May, 2019 1 commit
  4. 01 May, 2019 1 commit
  5. 26 Apr, 2019 2 commits
  6. 24 Apr, 2019 4 commits
    • Chris Wilson's avatar
      drm/i915: Invert the GEM wakeref hierarchy · 79ffac85
      Chris Wilson authored
      In the current scheme, on submitting a request we take a single global
      GEM wakeref, which trickles down to wake up all GT power domains. This
      is undesirable as we would like to be able to localise our power
      management to the available power domains and to remove the global GEM
      operations from the heart of the driver. (The intent there is to push
      global GEM decisions to the boundary as used by the GEM user interface.)
      
      Now during request construction, each request is responsible via its
      logical context to acquire a wakeref on each power domain it intends to
      utilize. Currently, each request takes a wakeref on the engine(s) and
      the engines themselves take a chipset wakeref. This gives us a
      transition on each engine which we can extend if we want to insert more
      powermangement control (such as soft rc6). The global GEM operations
      that currently require a struct_mutex are reduced to listening to pm
      events from the chipset GT wakeref. As we reduce the struct_mutex
      requirement, these listeners should evaporate.
      
      Perhaps the biggest immediate change is that this removes the
      struct_mutex requirement around GT power management, allowing us greater
      flexibility in request construction. Another important knock-on effect,
      is that by tracking engine usage, we can insert a switch back to the
      kernel context on that engine immediately, avoiding any extra delay or
      inserting global synchronisation barriers. This makes tracking when an
      engine and its associated contexts are idle much easier -- important for
      when we forgo our assumed execution ordering and need idle barriers to
      unpin used contexts. In the process, it means we remove a large chunk of
      code whose only purpose was to switch back to the kernel context.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Imre Deak <imre.deak@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
      79ffac85
    • Chris Wilson's avatar
      drm/i915: Pull the GEM powermangement coupling into its own file · 23c3c3d0
      Chris Wilson authored
      Split out the powermanagement portion (GT wakeref, suspend/resume) of
      GEM from i915_gem.c into its own file.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-2-chris@chris-wilson.co.uk
      23c3c3d0
    • Chris Wilson's avatar
      drm/i915: Move GraphicsTechnology files under gt/ · 112ed2d3
      Chris Wilson authored
      Start partitioning off the code that talks to the hardware (GT) from the
      uapi layers and move the device facing code under gt/
      
      One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
      subdivide that header and body further (and split out the submission
      code from the ringbuffer and logical context handling). This patch aims
      to be simple motion so git can fixup inflight patches with little mess.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Acked-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Acked-by: default avatarJani Nikula <jani.nikula@intel.com>
      Acked-by: default avatarRodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
      112ed2d3
    • Chris Wilson's avatar
      drm/i915: Avoid use-after-free in reporting create.size · 929eec99
      Chris Wilson authored
      We have to avoid chasing after a userspace race!
      
      <3>[  473.114328] BUG: KASAN: use-after-free in i915_gem_create+0x1d2/0x1f0 [i915]
      <3>[  473.114389] Read of size 8 at addr ffff88815bf1d840 by task gem_flink_race/1541
      
      <4>[  473.114464] CPU: 1 PID: 1541 Comm: gem_flink_race Tainted: G     U            5.1.0-rc4-g7d07e025e786-kasan_88+ #1
      <4>[  473.114469] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.10 09/29/2016
      <4>[  473.114474] Call Trace:
      <4>[  473.114488]  dump_stack+0x7c/0xbb
      <4>[  473.114612]  ? i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.114621]  print_address_description+0x65/0x270
      <4>[  473.114728]  ? i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.114839]  ? i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.114848]  kasan_report+0x149/0x18d
      <4>[  473.114962]  ? i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.115069]  i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.115176]  ? i915_gem_object_create.part.28+0x4b0/0x4b0 [i915]
      <4>[  473.115289]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
      <4>[  473.115297]  drm_ioctl_kernel+0x192/0x260
      <4>[  473.115306]  ? drm_ioctl_permit+0x280/0x280
      <4>[  473.115326]  drm_ioctl+0x67c/0x960
      <4>[  473.115438]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
      <4>[  473.115448]  ? drm_getstats+0x20/0x20
      <4>[  473.115459]  ? __lock_acquire+0xa66/0x3fe0
      <4>[  473.115474]  ? _raw_spin_unlock_irqrestore+0x39/0x60
      <4>[  473.115485]  ? debug_object_active_state+0x2ea/0x4e0
      <4>[  473.115496]  ? debug_show_all_locks+0x2d0/0x2d0
      <4>[  473.115513]  do_vfs_ioctl+0x18d/0xfa0
      <4>[  473.115522]  ? check_flags.part.27+0x440/0x440
      <4>[  473.115532]  ? ioctl_preallocate+0x1a0/0x1a0
      <4>[  473.115547]  ? __fget+0x2ac/0x410
      <4>[  473.115561]  ? __ia32_sys_dup3+0xb0/0xb0
      <4>[  473.115569]  ? rwlock_bug.part.0+0x90/0x90
      <4>[  473.115590]  ksys_ioctl+0x35/0x70
      <4>[  473.115597]  ? lockdep_hardirqs_off+0x1cb/0x2b0
      <4>[  473.115608]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  473.115614]  ? lockdep_hardirqs_on+0x342/0x590
      <4>[  473.115623]  do_syscall_64+0x97/0x400
      <4>[  473.115633]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  473.115641] RIP: 0033:0x7fce590d55d7
      <4>[  473.115649] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
      <4>[  473.115655] RSP: 002b:00007fce4d525ba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      <4>[  473.115662] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fce590d55d7
      <4>[  473.115667] RDX: 00007fce4d525c10 RSI: 00000000c010645b RDI: 0000000000000007
      <4>[  473.115672] RBP: 00007fce4d525c10 R08: 00007fce4d526700 R09: 00007fce4d526700
      <4>[  473.115677] R10: 0000000000000054 R11: 0000000000000246 R12: 00000000c010645b
      <4>[  473.115682] R13: 0000000000000007 R14: 0000000000000000 R15: 00007ffe0e4a7450
      
      <3>[  473.115731] Allocated by task 1541:
      <4>[  473.115766]  kmem_cache_alloc+0xce/0x290
      <4>[  473.115895]  i915_gem_object_create.part.28+0x1c/0x4b0 [i915]
      <4>[  473.116000]  i915_gem_create+0xe3/0x1f0 [i915]
      <4>[  473.116008]  drm_ioctl_kernel+0x192/0x260
      <4>[  473.116013]  drm_ioctl+0x67c/0x960
      <4>[  473.116020]  do_vfs_ioctl+0x18d/0xfa0
      <4>[  473.116026]  ksys_ioctl+0x35/0x70
      <4>[  473.116032]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  473.116038]  do_syscall_64+0x97/0x400
      <4>[  473.116044]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      <3>[  473.116071] Freed by task 1542:
      <4>[  473.116101]  kmem_cache_free+0xb7/0x2f0
      <4>[  473.116205]  __i915_gem_free_objects+0x7d4/0xe10 [i915]
      <4>[  473.116311]  i915_gem_create_ioctl+0xaa/0xd0 [i915]
      <4>[  473.116318]  drm_ioctl_kernel+0x192/0x260
      <4>[  473.116323]  drm_ioctl+0x67c/0x960
      <4>[  473.116330]  do_vfs_ioctl+0x18d/0xfa0
      <4>[  473.116335]  ksys_ioctl+0x35/0x70
      <4>[  473.116341]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  473.116347]  do_syscall_64+0x97/0x400
      <4>[  473.116354]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Testcase: igt/gem_flink_race/flink_close
      Fixes: e163484a ("drm/i915: Update size upon return from GEM_CREATE")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190417132507.27133-1-chris@chris-wilson.co.uk
      (cherry picked from commit 99534023)
      Signed-off-by: default avatarJoonas Lahtinen <joonas.lahtinen@linux.intel.com>
      929eec99
  7. 20 Apr, 2019 1 commit
    • Chris Wilson's avatar
      drm/i915: Start writeback from the shrinker · 2d6692e6
      Chris Wilson authored
      When we are called to relieve mempressue via the shrinker, the only way
      we can make progress is either by discarding unwanted pages (those
      objects that userspace has marked MADV_DONTNEED) or by reclaiming the
      dirty objects via swap. As we know that is the only way to make further
      progress, we can initiate the writeback as we invalidate the objects.
      This means the objects we put onto the inactive anon lru list are
      already marked for reclaim+writeback and so will trigger a wait upon the
      writeback inside direct reclaim, greatly improving the success rate of
      direct reclaim on i915 objects.
      
      The corollary is that we may start a slow swap on opportunistic
      mempressure from the likes of the compaction + migration kthreads. This
      is limited by those threads only being allowed to shrink idle pages, but
      also that if we reactivate the page before it is swapped out by gpu
      activity, we only page the cost of repinning the page. The cost is most
      felt when an object is reused after mempressure, which hopefully
      excludes the latency sensitive tasks (as we are just extending the
      impact of swap thrashing to them).
      
      Apparently this is not the first time we've had this idea. Back in
      commit 5537252b ("drm/i915: Invalidate our pages under memory
      pressure") we wanted to start writeback but settled on invalidate after
      Hugh Dickins warned us about a possibility of a deadlock within shmemfs
      if we started writeback from shrink_slab. Looking at the callchain,
      using writeback from i915_gem_shrink should be equivalent to the pageout
      also employed by shrink_slab, i.e. it should not be any riskier afaict.
      
      v2: Leave mmapings intact. At this point, the only mmapings of our
      objects will be via CPU mmaps on the shmemfs filp, which are
      out-of-scope for our LRU tracking. Instead leave those pages to the
      inactive anon LRU page list for aging and pageout as normal.
      
      v3: Be selective on which paths trigger writeback, in particular
      excluding paths shrinking just to reclaim vm space (e.g. mmap, vmap
      reapers) and avoid starting writeback on the entire process space from
      within the pm freezer.
      
      References: https://bugs.freedesktop.org/show_bug.cgi?id=108686Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> #v1
      Link: https://patchwork.freedesktop.org/patch/msgid/20190420115539.29081-1-chris@chris-wilson.co.uk
      2d6692e6
  8. 17 Apr, 2019 2 commits
    • Chris Wilson's avatar
      drm/i915: Avoid use-after-free in reporting create.size · 99534023
      Chris Wilson authored
      We have to avoid chasing after a userspace race!
      
      <3>[  473.114328] BUG: KASAN: use-after-free in i915_gem_create+0x1d2/0x1f0 [i915]
      <3>[  473.114389] Read of size 8 at addr ffff88815bf1d840 by task gem_flink_race/1541
      
      <4>[  473.114464] CPU: 1 PID: 1541 Comm: gem_flink_race Tainted: G     U            5.1.0-rc4-g7d07e025e786-kasan_88+ #1
      <4>[  473.114469] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J4205-ITX, BIOS P1.10 09/29/2016
      <4>[  473.114474] Call Trace:
      <4>[  473.114488]  dump_stack+0x7c/0xbb
      <4>[  473.114612]  ? i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.114621]  print_address_description+0x65/0x270
      <4>[  473.114728]  ? i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.114839]  ? i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.114848]  kasan_report+0x149/0x18d
      <4>[  473.114962]  ? i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.115069]  i915_gem_create+0x1d2/0x1f0 [i915]
      <4>[  473.115176]  ? i915_gem_object_create.part.28+0x4b0/0x4b0 [i915]
      <4>[  473.115289]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
      <4>[  473.115297]  drm_ioctl_kernel+0x192/0x260
      <4>[  473.115306]  ? drm_ioctl_permit+0x280/0x280
      <4>[  473.115326]  drm_ioctl+0x67c/0x960
      <4>[  473.115438]  ? i915_gem_dumb_create+0x1a0/0x1a0 [i915]
      <4>[  473.115448]  ? drm_getstats+0x20/0x20
      <4>[  473.115459]  ? __lock_acquire+0xa66/0x3fe0
      <4>[  473.115474]  ? _raw_spin_unlock_irqrestore+0x39/0x60
      <4>[  473.115485]  ? debug_object_active_state+0x2ea/0x4e0
      <4>[  473.115496]  ? debug_show_all_locks+0x2d0/0x2d0
      <4>[  473.115513]  do_vfs_ioctl+0x18d/0xfa0
      <4>[  473.115522]  ? check_flags.part.27+0x440/0x440
      <4>[  473.115532]  ? ioctl_preallocate+0x1a0/0x1a0
      <4>[  473.115547]  ? __fget+0x2ac/0x410
      <4>[  473.115561]  ? __ia32_sys_dup3+0xb0/0xb0
      <4>[  473.115569]  ? rwlock_bug.part.0+0x90/0x90
      <4>[  473.115590]  ksys_ioctl+0x35/0x70
      <4>[  473.115597]  ? lockdep_hardirqs_off+0x1cb/0x2b0
      <4>[  473.115608]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  473.115614]  ? lockdep_hardirqs_on+0x342/0x590
      <4>[  473.115623]  do_syscall_64+0x97/0x400
      <4>[  473.115633]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      <4>[  473.115641] RIP: 0033:0x7fce590d55d7
      <4>[  473.115649] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
      <4>[  473.115655] RSP: 002b:00007fce4d525ba8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      <4>[  473.115662] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fce590d55d7
      <4>[  473.115667] RDX: 00007fce4d525c10 RSI: 00000000c010645b RDI: 0000000000000007
      <4>[  473.115672] RBP: 00007fce4d525c10 R08: 00007fce4d526700 R09: 00007fce4d526700
      <4>[  473.115677] R10: 0000000000000054 R11: 0000000000000246 R12: 00000000c010645b
      <4>[  473.115682] R13: 0000000000000007 R14: 0000000000000000 R15: 00007ffe0e4a7450
      
      <3>[  473.115731] Allocated by task 1541:
      <4>[  473.115766]  kmem_cache_alloc+0xce/0x290
      <4>[  473.115895]  i915_gem_object_create.part.28+0x1c/0x4b0 [i915]
      <4>[  473.116000]  i915_gem_create+0xe3/0x1f0 [i915]
      <4>[  473.116008]  drm_ioctl_kernel+0x192/0x260
      <4>[  473.116013]  drm_ioctl+0x67c/0x960
      <4>[  473.116020]  do_vfs_ioctl+0x18d/0xfa0
      <4>[  473.116026]  ksys_ioctl+0x35/0x70
      <4>[  473.116032]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  473.116038]  do_syscall_64+0x97/0x400
      <4>[  473.116044]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      <3>[  473.116071] Freed by task 1542:
      <4>[  473.116101]  kmem_cache_free+0xb7/0x2f0
      <4>[  473.116205]  __i915_gem_free_objects+0x7d4/0xe10 [i915]
      <4>[  473.116311]  i915_gem_create_ioctl+0xaa/0xd0 [i915]
      <4>[  473.116318]  drm_ioctl_kernel+0x192/0x260
      <4>[  473.116323]  drm_ioctl+0x67c/0x960
      <4>[  473.116330]  do_vfs_ioctl+0x18d/0xfa0
      <4>[  473.116335]  ksys_ioctl+0x35/0x70
      <4>[  473.116341]  __x64_sys_ioctl+0x6a/0xb0
      <4>[  473.116347]  do_syscall_64+0x97/0x400
      <4>[  473.116354]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Testcase: igt/gem_flink_race/flink_close
      Fixes: e163484a ("drm/i915: Update size upon return from GEM_CREATE")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Michał Winiarski <michal.winiarski@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190417132507.27133-1-chris@chris-wilson.co.uk
      99534023
    • Chris Wilson's avatar
      drm/i915: Verify the engine workarounds stick on application · 254e1186
      Chris Wilson authored
      Read the engine workarounds back using the GPU after loading the initial
      context state to verify that we are setting them correctly, and bail if
      it fails.
      
      v2: Break out the verification into its own loop
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Reviewed-by: default avatarTvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190417075657.19456-3-chris@chris-wilson.co.uk
      254e1186
  9. 10 Apr, 2019 1 commit
  10. 08 Apr, 2019 1 commit
  11. 04 Apr, 2019 1 commit
  12. 02 Apr, 2019 1 commit
    • Chris Wilson's avatar
      drm/i915: Prefault before locking pages in shmem_pwrite · b01720bf
      Chris Wilson authored
      If the user passes in a pointer to a GGTT mmaping of the same buffer
      being written to, we can hit a deadlock in acquiring the shmemfs page
      (once as the write destination and then as the read source).
      
      [<0>] io_schedule+0xd/0x30
      [<0>] __lock_page+0x105/0x1b0
      [<0>] find_lock_entry+0x55/0x90
      [<0>] shmem_getpage_gfp+0xbb/0x800
      [<0>] shmem_read_mapping_page_gfp+0x2d/0x50
      [<0>] shmem_get_pages+0x158/0x5d0 [i915]
      [<0>] ____i915_gem_object_get_pages+0x17/0x90 [i915]
      [<0>] __i915_gem_object_get_pages+0x57/0x70 [i915]
      [<0>] i915_gem_fault+0x1b4/0x5c0 [i915]
      [<0>] __do_fault+0x2d/0x80
      [<0>] __handle_mm_fault+0xad4/0xfb0
      [<0>] handle_mm_fault+0xe6/0x1f0
      [<0>] __do_page_fault+0x18f/0x3f0
      [<0>] page_fault+0x1b/0x20
      [<0>] copy_user_enhanced_fast_string+0x7/0x10
      [<0>] _copy_from_user+0x37/0x60
      [<0>] shmem_pwrite+0xf0/0x160 [i915]
      [<0>] i915_gem_pwrite_ioctl+0x14e/0x520 [i915]
      [<0>] drm_ioctl_kernel+0x81/0xd0
      [<0>] drm_ioctl+0x1a7/0x310
      [<0>] do_vfs_ioctl+0x88/0x5d0
      [<0>] ksys_ioctl+0x35/0x70
      [<0>] __x64_sys_ioctl+0x11/0x20
      [<0>] do_syscall_64+0x39/0xe0
      [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      We can reduce (but not eliminate!) the chance of this happening by
      faulting the user_data before we take the page lock in
      pagecache_write_begin(). One way to eliminate the potential recursion
      here is by disabling pagefaults for the copy, and handling the fallback
      to use an alternative method -- so convert to use kmap_atomic (which
      should disable preemption and pagefaulting for the copy) and report
      ENODEV instead of EFAULT so that our caller tries again with a different
      copy mechanism -- we already check that the page should have been
      faultable so a false negative should be rare.
      
      Testcase: igt/gem_pwrite/self
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Matthew Auld <matthew.william.auld@gmail.com>
      Reviewed-by: default avatarMatthew Auld <matthew.william.auld@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190401133909.31203-1-chris@chris-wilson.co.uk
      b01720bf
  13. 31 Mar, 2019 1 commit
  14. 26 Mar, 2019 1 commit
  15. 24 Mar, 2019 1 commit
  16. 22 Mar, 2019 1 commit
  17. 21 Mar, 2019 2 commits
    • Chris Wilson's avatar
      drm/i915: Skip object locking around a no-op set-domain ioctl · 754a2544
      Chris Wilson authored
      If we are already in the desired write domain of a set-domain ioctl,
      then there is nothing for us to do and we can quickly return back to
      userspace, avoiding any lock contention. By recognising that the
      write_domain is always a subset of the read_domains, and excluding the
      no-op case of requiring 0 read_domains in the ioctl, we can infer if the
      current write_domain matches the target read_domains, there is nothing
      for us to do.
      
      Secondary aspect of this is that we undo the arbitrary fetching and
      potential flushing of all pages for a set-domain(.write=CPU) call on a
      fresh object -- which was introduced simply because we do the get-pages
      before taking the struct_mutex.
      
      References: 40e62d5d ("drm/i915: Acquire the backing storage outside of struct_mutex in set-domain")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Matthew Auld <matthew.william.auld@gmail.com>
      Reviewed-by: default avatarMatthew Auld <matthew.william.auld@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-2-chris@chris-wilson.co.uk
      754a2544
    • Chris Wilson's avatar
      drm/i915: Flush pages on acquisition · a679f58d
      Chris Wilson authored
      When we return pages to the system, we ensure that they are marked as
      being in the CPU domain since any external access is uncontrolled and we
      must assume the worst. This means that we need to always flush the pages
      on acquisition if we need to use them on the GPU, and from the beginning
      have used set-domain. Set-domain is overkill for the purpose as it is a
      general synchronisation barrier, but our intent is to only flush the
      pages being swapped in. If we move that flush into the pages acquisition
      phase, we know then that when we have obj->mm.pages, they are coherent
      with the GPU and need only maintain that status without resorting to
      heavy handed use of set-domain.
      
      The principle knock-on effect for userspace is through mmap-gtt
      pagefaulting. Our uAPI has always implied that the GTT mmap was async
      (especially as when any pagefault occurs is unpredicatable to userspace)
      and so userspace had to apply explicit domain control itself
      (set-domain). However, swapping is transparent to the kernel, and so on
      first fault we need to acquire the pages and make them coherent for
      access through the GTT. Our use of set-domain here leaks into the uABI
      that the first pagefault was synchronous. This is unintentional and
      baring a few igt should be unoticed, nevertheless we bump the uABI
      version for mmap-gtt to reflect the change in behaviour.
      
      Another implication of the change is that gem_create() is presumed to
      create an object that is coherent with the CPU and is in the CPU write
      domain, so a set-domain(CPU) following a gem_create() would be a minor
      operation that merely checked whether we could allocate all pages for
      the object. On applying this change, a set-domain(CPU) causes a clflush
      as we acquire the pages. This will have a small impact on mesa as we move
      the clflush here on !llc from execbuf time to create, but that should
      have minimal performance impact as the same clflush exists but is now
      done early and because of the clflush issue, userspace recycles bo and
      so should resist allocating fresh objects.
      
      Internally, the presumption that objects are created in the CPU
      write-domain and remain so through writes to obj->mm.mapping is more
      prevalent than I expected; but easy enough to catch and apply a manual
      flush.
      
      For the future, we should push the page flush from the central
      set_pages() into the callers so that we can more finely control when it
      is applied, but for now doing it one location is easier to validate, at
      the cost of sometimes flushing when there is no need.
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: Matthew Auld <matthew.william.auld@gmail.com>
      Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Cc: Antonio Argenziano <antonio.argenziano@intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Reviewed-by: default avatarMatthew Auld <matthew.william.auld@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20190321161908.8007-1-chris@chris-wilson.co.uk
      a679f58d
  18. 20 Mar, 2019 2 commits
  19. 18 Mar, 2019 2 commits
  20. 08 Mar, 2019 8 commits