1. 22 Oct, 2021 13 commits
    • drm/i915/guc: Fix recursive lock in GuC submission · 12a9917e
      Matthew Brost authored
      Use __release_guc_id (lock held) rather than release_guc_id (acquires
      lock), add lockdep annotations.
      
      [ 213.280129] i915: Running i915_perf_live_selftests/live_noa_gpr
      [ 213.283459] ============================================
      [ 213.283462] WARNING: possible recursive locking detected
      [ 213.283466] 5.15.0-rc6+ #18 Tainted: G U W
      [ 213.283470] --------------------------------------------
      [ 213.283472] kworker/u24:0/8 is trying to acquire lock:
      [ 213.283475] ffff8ffc4f6cc1e8 (&guc->submission_state.lock){....}-{2:2}, at: destroyed_worker_func+0x2df/0x350 [i915]
      [ 213.283618]
      but task is already holding lock:
      [ 213.283621] ffff8ffc4f6cc1e8 (&guc->submission_state.lock){....}-{2:2}, at: destroyed_worker_func+0x4f/0x350 [i915]
      [ 213.283720]
      other info that might help us debug this:
      [ 213.283724] Possible unsafe locking scenario:
      [ 213.283727] CPU0
      [ 213.283728] ----
      [ 213.283730] lock(&guc->submission_state.lock);
      [ 213.283734] lock(&guc->submission_state.lock);
      [ 213.283737]
      *** DEADLOCK ***
      [ 213.283740] May be due to missing lock nesting notation
      [ 213.283744] 3 locks held by kworker/u24:0/8:
      [ 213.283747] #0: ffff8ffb80059d38 ((wq_completion)events_unbound){..}-{0:0}, at: process_one_work+0x1f3/0x550
      [ 213.283757] #1: ffffb509000e3e78 ((work_completion)(&guc->submission_state.destroyed_worker)){..}-{0:0}, at: process_one_work+0x1f3/0x550
      [ 213.283766] #2: ffff8ffc4f6cc1e8 (&guc->submission_state.lock){....}-{2:2}, at: destroyed_worker_func+0x4f/0x350 [i915]
      [ 213.283860]
      stack backtrace:
      [ 213.283863] CPU: 8 PID: 8 Comm: kworker/u24:0 Tainted: G U W 5.15.0-rc6+ #18
      [ 213.283868] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
      [ 213.283873] Workqueue: events_unbound destroyed_worker_func [i915]
      [ 213.283957] Call Trace:
      [ 213.283960] dump_stack_lvl+0x57/0x72
      [ 213.283966] __lock_acquire.cold+0x191/0x2d3
      [ 213.283972] lock_acquire+0xb5/0x2b0
      [ 213.283978] ? destroyed_worker_func+0x2df/0x350 [i915]
      [ 213.284059] ? destroyed_worker_func+0x2d7/0x350 [i915]
      [ 213.284139] ? lock_release+0xb9/0x280
      [ 213.284143] _raw_spin_lock_irqsave+0x48/0x60
      [ 213.284148] ? destroyed_worker_func+0x2df/0x350 [i915]
      [ 213.284226] destroyed_worker_func+0x2df/0x350 [i915]
      [ 213.284310] process_one_work+0x270/0x550
      [ 213.284315] worker_thread+0x52/0x3b0
      [ 213.284319] ? process_one_work+0x550/0x550
      [ 213.284322] kthread+0x135/0x160
      [ 213.284326] ? set_kthread_struct+0x40/0x40
      [ 213.284331] ret_from_fork+0x1f/0x30
      
      and a bit later in the trace:
      
      [ 227.499864] do_raw_spin_lock+0x94/0xa0
      [ 227.499868] _raw_spin_lock_irqsave+0x50/0x60
      [ 227.499871] ? guc_flush_destroyed_contexts+0x4f/0xf0 [i915]
      [ 227.499995] guc_flush_destroyed_contexts+0x4f/0xf0 [i915]
      [ 227.500104] intel_guc_submission_reset_prepare+0x99/0x4b0 [i915]
      [ 227.500209] ? mark_held_locks+0x49/0x70
      [ 227.500212] intel_uc_reset_prepare+0x46/0x50 [i915]
      [ 227.500320] reset_prepare+0x78/0x90 [i915]
      [ 227.500412] __intel_gt_set_wedged.part.0+0x13/0xe0 [i915]
      [ 227.500485] intel_gt_set_wedged.part.0+0x54/0x100 [i915]
      [ 227.500556] intel_gt_set_wedged_on_fini+0x1a/0x30 [i915]
      [ 227.500622] intel_gt_driver_unregister+0x1e/0x60 [i915]
      [ 227.500694] i915_driver_remove+0x4a/0xf0 [i915]
      [ 227.500767] i915_pci_probe+0x84/0x170 [i915]
      [ 227.500838] local_pci_probe+0x42/0x80
      [ 227.500842] pci_device_probe+0xd9/0x190
      [ 227.500844] really_probe+0x1f2/0x3f0
      [ 227.500847] __driver_probe_device+0xfe/0x180
      [ 227.500848] driver_probe_device+0x1e/0x90
      [ 227.500850] __driver_attach+0xc4/0x1d0
      [ 227.500851] ? __device_attach_driver+0xe0/0xe0
      [ 227.500853] ? __device_attach_driver+0xe0/0xe0
      [ 227.500854] bus_for_each_dev+0x64/0x90
      [ 227.500856] bus_add_driver+0x12e/0x1f0
      [ 227.500857] driver_register+0x8f/0xe0
      [ 227.500859] i915_init+0x1d/0x8f [i915]
      [ 227.500934] ? 0xffffffffc144a000
      [ 227.500936] do_one_initcall+0x58/0x2d0
      [ 227.500938] ? rcu_read_lock_sched_held+0x3f/0x80
      [ 227.500940] ? kmem_cache_alloc_trace+0x238/0x2d0
      [ 227.500944] do_init_module+0x5c/0x270
      [ 227.500946] __do_sys_finit_module+0x95/0xe0
      [ 227.500949] do_syscall_64+0x38/0x90
      [ 227.500951] entry_SYSCALL_64_after_hwframe+0x44/0xae
      [ 227.500953] RIP: 0033:0x7ffa59d2ae0d
      [ 227.500954] Code: c8 0c 00 0f 05 eb a9 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3b 80 0c 00 f7 d8 64 89 01 48
      [ 227.500955] RSP: 002b:00007fff320bbf48 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
      [ 227.500956] RAX: ffffffffffffffda RBX: 00000000022ea710 RCX: 00007ffa59d2ae0d
      [ 227.500957] RDX: 0000000000000000 RSI: 00000000022e1d90 RDI: 0000000000000004
      [ 227.500958] RBP: 0000000000000020 R08: 00007ffa59df3a60 R09: 0000000000000070
      [ 227.500958] R10: 00000000022e1d90 R11: 0000000000000246 R12: 00000000022e1d90
      [ 227.500959] R13: 00000000022e58e0 R14: 0000000000000043 R15: 00000000022e42c0
      
      v2:
       (CI build)
        - Fix build error
      
      Fixes: 1a52faed ("drm/i915/guc: Take GT PM ref when deregistering context")
      Signed-off-by: Matthew Brost <matthew.brost@intel.com>
      Cc: stable@vger.kernel.org
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211020192147.8048-1-matthew.brost@intel.com
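The fix above relies on a common convention: one helper that expects the caller to already hold the lock, and a wrapper that takes the lock itself. The following is a minimal userspace sketch of that convention, not the i915 code. The `toy_*` names and the flag-based lock are hypothetical stand-ins; `toy_release_id_locked` plays the role of `__release_guc_id`, `toy_release_id` the role of `release_guc_id`, and the `assert` stands in for a lockdep annotation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a non-recursive spinlock: it only records whether it is held. */
struct toy_lock {
	bool held;
};

/* Returns false on a recursive acquire, where a real spinlock would deadlock. */
static bool toy_lock_acquire(struct toy_lock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void toy_lock_release(struct toy_lock *lock)
{
	assert(lock->held);
	lock->held = false;
}

/* Hypothetical, simplified submission state. */
struct toy_guc {
	struct toy_lock lock;
	int ids_in_use;
};

/* "Lock held" variant: the caller must already hold guc->lock. */
static void toy_release_id_locked(struct toy_guc *guc)
{
	assert(guc->lock.held); /* stands in for a lockdep annotation */
	guc->ids_in_use--;
}

/* Locking variant: takes guc->lock itself, so it must not be called with it held. */
static bool toy_release_id(struct toy_guc *guc)
{
	if (!toy_lock_acquire(&guc->lock))
		return false;
	toy_release_id_locked(guc);
	toy_lock_release(&guc->lock);
	return true;
}

/* A worker that holds the lock across its body, like the one in the trace. */
static bool toy_destroy_worker(struct toy_guc *guc, bool use_locked_variant)
{
	bool ok = true;

	if (!toy_lock_acquire(&guc->lock))
		return false;
	if (use_locked_variant)
		toy_release_id_locked(guc); /* correct: lock is already held */
	else
		ok = toy_release_id(guc);   /* bug: second acquire of the same lock */
	toy_lock_release(&guc->lock);
	return ok;
}
```

Calling `toy_destroy_worker(&guc, true)` releases one id, while passing `false` reproduces the recursive acquire in miniature and leaves the count unchanged.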
    • drm/i915/selftests: Update live.evict to wait on requests / idle GPU after each loop · 393211e1
      Matthew Brost authored
      Update live.evict to wait on the last request and idle the GPU after each
      loop. This not only enhances the test to fill the GGTT on each engine class
      but also avoids timeouts from igt_flush_test when using GuC submission.
      igt_flush_test (idle GPU) can take a long time with GuC submission if
      lots of contexts are created, due to the H2G / G2H required to destroy
      contexts.
      Signed-off-by: Matthew Brost <matthew.brost@intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211021214040.33292-1-matthew.brost@intel.com
    • drm/i915/selftests: Increase timeout in requests perf selftest · 7c287113
      Matthew Brost authored
      perf_parallel_engines is a micro benchmark to test i915 request
      scheduling. The test creates a thread per physical engine and, in a
      loop, submits NOP requests and waits for them to complete. In execlists
      mode this works perfectly fine, as a powerful CPU has enough cores to feed
      each engine and process the CSBs. With GuC submission the uC gets
      overwhelmed, as all threads feed into a single CTB channel and the GuC
      gets bombarded with CSBs as contexts are immediately switched in and out
      on the engines due to the zero runtime of the requests. When the GuC is
      overwhelmed, scheduling of contexts is unfair due to the nature of the
      GuC scheduling algorithm. This behavior is understood and deemed
      acceptable, as this micro benchmark isn't close to a real-world use case.
      Increase the timeout of the wait period for requests to complete. This
      makes the test understand that it is OK for contexts to get starved in this
      scenario.
      
      A future patch / cleanup may just delete these micro benchmark tests as
      they basically mean nothing. We care about real workloads not made up
      ones.
      Signed-off-by: Matthew Brost <matthew.brost@intel.com>
      Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211011175704.28509-1-matthew.brost@intel.com
    • drm/i915/ttm: enable shmem tt backend · 5d12ffe6
      Matthew Auld authored
      Turn on the shmem tt backend, and enable shrinking.
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-8-matthew.auld@intel.com
    • drm/i915/ttm: use cached system pages when evicting lmem · 2eda4fc6
      Matthew Auld authored
      This should let us do an accelerated copy directly to the shmem pages
      when temporarily moving lmem-only objects, where the i915-gem shrinker
      can later kick in to swap out the pages, if needed.
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-7-matthew.auld@intel.com
    • drm/i915/ttm: move shrinker management into adjust_lru · ebd4a8ec
      Matthew Auld authored
      We currently just evict lmem objects to system memory when under memory
      pressure. For this case we might lack the usual object mm.pages, which
      effectively hides the pages from the i915-gem shrinker, until we
      actually "attach" the TT to the object, or in the case of lmem-only
      objects it just gets migrated back to lmem when touched again.
      
      For all cases we can just adjust the i915 shrinker LRU each time we also
      adjust the TTM LRU. The two cases we care about are:
      
        1) When something is moved by TTM, including when initially populating
           an object. Importantly this covers the case where TTM moves something from
           lmem <-> smem, outside of the normal get_pages() interface, which
           should still ensure the shmem pages underneath are reclaimable.
      
        2) When calling into i915_gem_object_unlock(). The unlock should
           ensure the object is removed from the shrinker LRU, if it was indeed
           swapped out, or just purged, when the shrinker drops the object lock.
      
      v2(Thomas):
        - Handle managing the shrinker LRU in adjust_lru, where it is always
          safe to touch the object.
      v3(Thomas):
        - Pretty much a re-write. This time piggy back off the shrink_pin
          stuff, which actually seems to fit quite well for what we want here.
      v4(Thomas):
        - Just use a simple boolean for tracking ttm_shrinkable.
      v5:
        - Ensure we call adjust_lru when faulting the object, to ensure the
          pages are visible to the shrinker, if needed.
        - Add back the adjust_lru when in i915_ttm_move (Thomas)
      v6(Reported-by: kernel test robot <lkp@intel.com>):
        - Remove unused i915_tt
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #v4
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-6-matthew.auld@intel.com
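The idea in this commit message, keeping shrinker membership in sync from one place and tracking it with a single boolean, can be illustrated with a small toy model. This is a sketch under stated assumptions, not the driver implementation: every `toy_*` name and field is hypothetical, and the real adjust_lru deals with locking and object state that is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified object state; not the real i915 structures. */
struct toy_obj {
	bool has_system_pages;  /* backing pages currently live in system memory */
	bool swapped_or_purged; /* the shrinker has already reclaimed the pages */
	bool ttm_shrinkable;    /* single boolean: visible to the shrinker or not */
};

/* Hypothetical stand-in for the shrinker's bookkeeping. */
struct toy_shrinker {
	int shrinkable_count;
};

/* One place that decides whether the object should be visible to the shrinker. */
static void toy_adjust_lru(struct toy_shrinker *s, struct toy_obj *obj)
{
	bool want = obj->has_system_pages && !obj->swapped_or_purged;

	if (want == obj->ttm_shrinkable)
		return; /* already in the right state, nothing to do */

	s->shrinkable_count += want ? 1 : -1;
	assert(s->shrinkable_count >= 0);
	obj->ttm_shrinkable = want;
}

/* Case 1: the object is moved (or initially populated). */
static void toy_move(struct toy_shrinker *s, struct toy_obj *obj, bool to_system)
{
	obj->has_system_pages = to_system;
	toy_adjust_lru(s, obj);
}

/* Case 2: the object lock is dropped, possibly after a swap-out or purge. */
static void toy_unlock(struct toy_shrinker *s, struct toy_obj *obj)
{
	toy_adjust_lru(s, obj);
}
```

Because both paths funnel through `toy_adjust_lru`, repeated calls are idempotent and the count cannot drift from the per-object flag.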
    • drm/i915: add some kernel-doc for shrink_pin and friends · e25d1ea4
      Matthew Auld authored
      Attempt to document shrink_pin and the other relevant interfaces that
      interact with it, before we start messing with it.
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-5-matthew.auld@intel.com
    • drm/i915: drop unneeded make_unshrinkable in free_object · 893f11f0
      Matthew Auld authored
      The comment here is no longer accurate, since the current shrinker code
      requires a full ref before touching any objects. Also unset_pages()
      should already do the required make_unshrinkable() for us, if needed,
      which is also nicely balanced with set_pages().
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-4-matthew.auld@intel.com
    • drm/i915/gtt: drop unneeded make_unshrinkable · 5926ff80
      Matthew Auld authored
      We already do this when mapping the pages.
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-3-matthew.auld@intel.com
    • drm/i915/ttm: add tt shmem backend · 7ae03459
      Matthew Auld authored
      For cached objects we can allocate our pages directly in shmem. This
      should make it possible (in a later patch) to utilise the existing
      i915-gem shrinker code for such objects. For now this is still disabled.
      
      v2(Thomas):
        - Add optional try_to_writeback hook for objects. Importantly we need
          to check if the object is even still shrinkable; in between us
          dropping the shrinker LRU lock and acquiring the object lock it could for
          example have been moved. Also we need to differentiate between
          "lazy" shrinking and the immediate writeback mode. Also later we need to
          handle objects which don't even have mm.pages, so bundling this into
          put_pages() would require somehow handling that edge case, hence
          just letting the ttm backend handle everything in try_to_writeback
          doesn't seem too bad.
      v3(Thomas):
        - Likely a bad idea to touch the object from the unpopulate hook,
          since it's not possible to hold a reference without also creating
          a circular dependency, so likely this is too fragile. For now just
          ensure we at least mark the pages as dirty/accessed when called from the
          shrinker on WILLNEED objects.
        - s/try_to_writeback/shrinker_release_pages, since this can do more
          than just writeback.
        - Get rid of do_backup boolean and just set the SWAPPED flag prior to
          calling unpopulate.
        - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
          these just get skipped anyway. We can try to come up with something
          better later.
      v4(Thomas):
        - s/PCI_DMA/DMA/. Also drop NO_KERNEL_MAPPING and NO_WARN, which
          apparently don't do anything with streaming mappings.
        - Just pass along the error for ->truncate, and assume nothing.
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Cc: Christian König <christian.koenig@amd.com>
      Cc: Oak Zeng <oak.zeng@intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Acked-by: Oak Zeng <oak.zeng@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-2-matthew.auld@intel.com
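The v2 note about checking whether the object is still shrinkable describes a general pattern: state observed under one lock must be re-validated after that lock is dropped and a different lock is taken. The toy below sketches that pattern only. All `toy_*` names are hypothetical, the locks are plain flags, and the `racer` callback is an illustrative device that models another thread running in the unlocked window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified object; not the real i915 structures. */
struct toy_obj {
	bool locked;     /* stands in for the object lock */
	bool shrinkable; /* may change while no lock is held */
	int pages;
};

/*
 * Try to release one object's pages. Returns the number of pages released,
 * or 0 if the object was skipped.
 */
static int toy_shrink_one(struct toy_obj *obj, void (*racer)(struct toy_obj *))
{
	int freed;

	/* Step 1: obj was picked off the LRU; the LRU lock has now been dropped. */
	if (racer)
		racer(obj); /* another thread may touch obj in this window */

	/* Step 2: take the object lock (a trylock in this toy). */
	if (obj->locked)
		return 0;
	obj->locked = true;

	/* Step 3: re-check, since the earlier observation may be stale. */
	if (!obj->shrinkable) {
		obj->locked = false;
		return 0;
	}

	freed = obj->pages;
	obj->pages = 0;
	obj->shrinkable = false;
	obj->locked = false;
	return freed;
}

/* Example racer: the object gets moved and is no longer shrinkable. */
static void toy_move_away(struct toy_obj *obj)
{
	obj->shrinkable = false;
}
```

Without the re-check in step 3, the second case would release pages from an object that had already been moved.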
    • drm/i915/gem: Break out some shmem backend utils · f05b985e
      Thomas Hellström authored
      Break out some shmem backend utils for future reuse by the TTM backend:
      shmem_alloc_st(), shmem_free_st() and __shmem_writeback() which we can
      use to provide a shmem-backed TTM page pool for cached-only TTM
      buffer objects.
      
      Main functional change here is that we now compute the page sizes using
      the dma segments rather than using the physical page address segments.
      
      v2(Reported-by: kernel test robot <lkp@intel.com>)
          - Make sure we initialise the mapping on the error path in
            shmem_get_pages()
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-1-matthew.auld@intel.com
    • Merge drm/drm-next into drm-intel-gt-next · ef3e6192
      Joonas Lahtinen authored
      Backmerging to pull in the new dma_resv iterators requested by
      Maarten and Matt.
      Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    • drm/i915/dmabuf: fix broken build · 777226da
      Matthew Auld authored
      wbinvd_on_all_cpus() is only defined on x86 it seems, plus we need to
      include asm/smp.h here.
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211021125332.2455288-1-matthew.auld@intel.com
  2. 21 Oct, 2021 2 commits
  3. 20 Oct, 2021 10 commits
  4. 18 Oct, 2021 3 commits
  5. 15 Oct, 2021 12 commits