  22 Apr, 2020 (4 commits)
    • drm/i915/execlists: Drop request-before-CS assertion · 15501287
      Chris Wilson authored
      When we migrated to execlists, one of the conditions we wanted to test
      for was whether the breadcrumb seqno was being written before the
      breadcrumb interrupt was delivered. This followed on from issues
      observed on previous generations, which were not so strongly ordered.
      With the removal of the missed-interrupt detection, we have no reliable
      means of detecting an out-of-order seqno/interrupt; instead we tried to
      assert that the relationship between the CS event interrupt and the
      breadcrumb write should be strongly ordered. However, Icelake proves it
      is possible for the HW implementation to forget about minor little
      details such as write ordering, and so the order between *processing*
      the CS event and the breadcrumb write is unreliable.
      
      Remove the unreliable assertion, but leave a debug telltale in case we
      have reason to suspect an issue; an illustrative sketch of this pattern
      follows after this entry.
      
      Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1658
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200422141749.28709-1-chris@chris-wilson.co.uk
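
      Below is a minimal, hypothetical C sketch of the pattern this commit
      describes: demoting a hard ordering assertion to a debug telltale. The
      type and function names (request_stub, breadcrumb_written,
      process_cs_event) are illustrative assumptions, not the actual i915 code.

      /*
       * Illustrative sketch only: the CS event may be processed before the
       * breadcrumb write is visible, so do not assert on the ordering; just
       * leave a telltale in the logs.
       */
      #include <linux/compiler.h>
      #include <linux/printk.h>
      #include <linux/types.h>

      struct request_stub {
              u32 seqno;            /* breadcrumb value the request will write */
              const u32 *hwsp;      /* status page slot the HW writes into */
      };

      static bool breadcrumb_written(const struct request_stub *rq)
      {
              /* Simplified: ignores seqno wraparound handling. */
              return READ_ONCE(*rq->hwsp) >= rq->seqno;
      }

      static void process_cs_event(const struct request_stub *rq)
      {
              /*
               * Previously this would have been a hard assertion, e.g.
               * BUG_ON(!breadcrumb_written(rq)); the ordering is not
               * guaranteed by the HW, so only note the anomaly.
               */
              if (!breadcrumb_written(rq))
                      pr_debug("CS event seen before breadcrumb write (seqno %u)\n",
                               rq->seqno);

              /* ... continue processing the completed request ... */
      }

      The point of the pattern is that the debug message keeps the anomaly
      visible without halting the machine when the HW reorders the writes.
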
    • drm/i915/gem: Hold obj->vma.lock over for_each_ggtt_vma() · cb593e5d
      Chris Wilson authored
      While the ggtt vma are protected by their object lifetime, the list
      continues until it hits a non-ggtt vma, and that vma is not protected
      and may be freed as we inspect it. Hence, we require the obj->vma.lock
      to protect the list as we iterate.
      
      An example of forgetting to hold the obj->vma.lock is
      
      [1642834.464973] general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP PTI
      [1642834.464977] CPU: 3 PID: 1954 Comm: Xorg Not tainted 5.6.0-300.fc32.x86_64 #1
      [1642834.464979] Hardware name: LENOVO 20ARS25701/20ARS25701, BIOS GJET94WW (2.44 ) 09/14/2017
      [1642834.465021] RIP: 0010:i915_gem_object_set_tiling+0x2c0/0x3e0 [i915]
      [1642834.465024] Code: 8b 84 24 18 01 00 00 f6 c4 80 74 59 49 8b 94 24 a0 00 00 00 49 8b 84 24 e0 00 00 00 49 8b 74 24 10 48 8b 92 30 01 00 00 89 c7 <80> ba 0a 06 00 00 03 0f 87 86 00 00 00 ba 00 00 08 00 b9 00 00 10
      [1642834.465025] RSP: 0018:ffffa98780c77d60 EFLAGS: 00010282
      [1642834.465028] RAX: ffff8d232bfb2578 RBX: 0000000000000002 RCX: ffff8d25873a0000
      [1642834.465029] RDX: dead000000000122 RSI: fffff0af8ac6e408 RDI: 000000002bfb2578
      [1642834.465030] RBP: ffff8d25873a0000 R08: ffff8d252bfb5638 R09: 0000000000000000
      [1642834.465031] R10: 0000000000000000 R11: ffff8d252bfb5640 R12: ffffa987801cb8f8
      [1642834.465032] R13: 0000000000001000 R14: ffff8d233e972e50 R15: ffff8d233e972d00
      [1642834.465034] FS:  00007f6a3d327f00(0000) GS:ffff8d25926c0000(0000) knlGS:0000000000000000
      [1642834.465036] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [1642834.465037] CR2: 00007f6a2064d000 CR3: 00000002fb57c001 CR4: 00000000001606e0
      [1642834.465038] Call Trace:
      [1642834.465083]  i915_gem_set_tiling_ioctl+0x122/0x230 [i915]
      [1642834.465121]  ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
      [1642834.465151]  drm_ioctl_kernel+0x86/0xd0 [drm]
      [1642834.465156]  ? avc_has_perm+0x3b/0x160
      [1642834.465178]  drm_ioctl+0x206/0x390 [drm]
      [1642834.465216]  ? i915_gem_object_set_tiling+0x3e0/0x3e0 [i915]
      [1642834.465221]  ? selinux_file_ioctl+0x122/0x1c0
      [1642834.465226]  ? __do_munmap+0x24b/0x4d0
      [1642834.465231]  ksys_ioctl+0x82/0xc0
      [1642834.465235]  __x64_sys_ioctl+0x16/0x20
      [1642834.465238]  do_syscall_64+0x5b/0xf0
      [1642834.465243]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      [1642834.465245] RIP: 0033:0x7f6a3d7b047b
      [1642834.465247] Code: 0f 1e fa 48 8b 05 1d aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d ed a9 0c 00 f7 d8 64 89 01 48
      [1642834.465249] RSP: 002b:00007ffe71adba28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
      [1642834.465251] RAX: ffffffffffffffda RBX: 000055f99048fa40 RCX: 00007f6a3d7b047b
      [1642834.465253] RDX: 00007ffe71adba30 RSI: 00000000c0106461 RDI: 000000000000000e
      [1642834.465254] RBP: 0000000000000002 R08: 000055f98f3f1798 R09: 0000000000000002
      [1642834.465255] R10: 0000000000001000 R11: 0000000000000246 R12: 0000000000000080
      [1642834.465257] R13: 000055f98f3f1690 R14: 00000000c0106461 R15: 00007ffe71adba30
      
      Now, to take the spinlock during the list iteration, we need to break
      the work into two phases. In the first phase, under the lock, we cannot
      sleep, so we only gather the vma onto a second list; the actual work is
      deferred and performed under the ggtt->mutex (an illustrative sketch of
      this two-phase pattern follows after this entry).
      
      We also need to hold the spinlock during creation of a new vma, to
      serialise against updates of the tiling on the object.
      Reported-by: Dave Airlie <airlied@redhat.com>
      Fixes: 2850748e ("drm/i915: Pull i915_vma_pin under the vm->mutex")
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: <stable@vger.kernel.org> # v5.5+
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200422072805.17340-1-chris@chris-wilson.co.uk
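
      A hedged, self-contained C sketch of the two-phase pattern described
      above: gather the ggtt entries onto a local list while holding the
      spinlock (where sleeping is forbidden), then perform the work that may
      sleep under a mutex. The struct and function names (obj_stub, vma_stub,
      update_ggtt_vmas, the work callback) are hypothetical, not the i915
      implementation.

      #include <linux/list.h>
      #include <linux/mutex.h>
      #include <linux/spinlock.h>

      struct obj_stub {
              spinlock_t vma_lock;        /* protects vma_list; no sleeping under it */
              struct list_head vma_list;  /* ggtt vma first, then non-ggtt vma */
      };

      struct vma_stub {
              struct list_head obj_link;  /* link on obj_stub.vma_list */
              struct list_head work_link; /* link on the local deferred list */
              bool is_ggtt;
      };

      static DEFINE_MUTEX(ggtt_mutex);    /* serialises the sleeping phase */

      static void update_ggtt_vmas(struct obj_stub *obj,
                                   void (*do_work)(struct vma_stub *vma))
      {
              struct vma_stub *vma;
              LIST_HEAD(deferred);

              /*
               * Phase 1: walk the list under the spinlock so entries cannot
               * be freed while we look at them; stop at the first non-ggtt
               * vma, as only the ggtt vma at the head are of interest.
               */
              spin_lock(&obj->vma_lock);
              list_for_each_entry(vma, &obj->vma_list, obj_link) {
                      if (!vma->is_ggtt)
                              break;
                      list_add_tail(&vma->work_link, &deferred);
              }
              spin_unlock(&obj->vma_lock);

              /*
               * Phase 2: the deferred ggtt vma are kept alive by the object's
               * lifetime, so the work that may sleep is done under the mutex.
               */
              mutex_lock(&ggtt_mutex);
              list_for_each_entry(vma, &deferred, work_link)
                      do_work(vma);
              mutex_unlock(&ggtt_mutex);
      }
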
    • drm/i915/selftests: Try to detect rollback during batchbuffer preemption · c92724de
      Chris Wilson authored
      Since batch buffers dominate execution time, most preemption requests
      should naturally occur during execution of a batch buffer. We wish to
      verify that, should a preemption occur within a batch buffer, the
      restart of that batch buffer resumes at the interrupted instruction
      and, most importantly, does not roll back to an earlier point (an
      illustrative sketch of the check follows after this entry).
      
      v2: Do not clear the GPRs at the start of the batch, but rely on them
      being clear for new contexts.
      Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200422100903.25216-1-chris@chris-wilson.co.uk
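
      A hypothetical C sketch of the kind of host-side check implied above,
      assuming the host periodically samples a progress counter that the batch
      advances as it executes; if the batch rolled back on resumption, a later
      sample would be smaller than an earlier one. The names here are
      illustrative, not the actual selftest's.

      #include <stdbool.h>
      #include <stdio.h>

      /*
       * samples[] holds host-side readings, in capture order, of a progress
       * counter the batch advances as it executes.  Resuming at the
       * interrupted instruction keeps the readings non-decreasing; a
       * rollback to an earlier point makes a later reading smaller than an
       * earlier one.
       */
      static bool check_no_rollback(const unsigned int *samples, unsigned int count)
      {
              for (unsigned int i = 1; i < count; i++) {
                      if (samples[i] < samples[i - 1]) {
                              fprintf(stderr,
                                      "rollback suspected: sample %u dropped from %u to %u\n",
                                      i, samples[i - 1], samples[i]);
                              return false;
                      }
              }

              return true;
      }
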
    • drm/i915/selftests: Disable heartbeat around RPS interrupt testing · cbb6f880
      Chris Wilson authored
      To verify receiving the EI interrupts, we need to hold the GPU in very
      precise conditions (in terms of C0 cycles during the EI). If we preempt
      the busy load to handle the heartbeat, this may perturb the busy load,
      causing us to miss the interrupt.
      
      The other tests, while not as time sensitive, may also be slightly
      perturbed, so apply the heartbeat protection across all the
      measurements (an illustrative sketch follows after this entry).
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20200422083855.26842-1-chris@chris-wilson.co.uk
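
      A brief C sketch of the bracketing described above: park the heartbeat
      for the duration of a measurement so it cannot preempt and perturb the
      busy load. heartbeat_park(), heartbeat_unpark() and run_ei_measurement()
      are placeholder stubs standing in for the selftest's helpers, not real
      i915 functions.

      struct engine_stub { int unused; };

      /* Placeholder stubs standing in for the selftest's helpers. */
      static void heartbeat_park(struct engine_stub *engine) { (void)engine; }
      static void heartbeat_unpark(struct engine_stub *engine) { (void)engine; }
      static int run_ei_measurement(struct engine_stub *engine) { (void)engine; return 0; }

      static int measure_with_heartbeat_parked(struct engine_stub *engine)
      {
              int err;

              /* No heartbeat preemption can disturb the busy load from here on. */
              heartbeat_park(engine);
              err = run_ei_measurement(engine);
              heartbeat_unpark(engine);

              return err;
      }
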