  1. 29 Jun, 2022 1 commit
  2. 22 Jun, 2022 1 commit
  3. 20 Jun, 2022 1 commit
    • drm/i915: Improve on suspend / resume time with VT-d enabled · 2ef6efa7
      Thomas Hellström authored
      When DMAR / VT-d is enabled, the display engine uses overfetching,
      presumably to deal with the increased latency. To avoid display engine
      errors and DMAR faults, as a workaround the GGTT is populated with scratch
      PTEs when VT-d is enabled. However, starting with gen10, write-combined
      writing of scratch PTEs is no longer possible, and as a result populating
      the full GGTT with scratch PTEs, for example on resume, becomes very slow
      as uncached access is needed.
      
      Therefore, on integrated GPUs, utilize the fact that the PTEs are stored in
      stolen memory, which retains its content across S3 suspend: don't clear the
      PTEs on suspend and resume. This improves resume time by around 100 ms.
      While 100+ ms might seem short, it's 10% to 20% of total resume time and
      important in some applications.
      
      One notable exception is Intel Rapid Start Technology which may cause
      stolen memory to be lost across what the OS perceives as S3 suspend.
      If IRST is enabled or if we can't detect whether IRST is enabled, retain
      the old workaround, clearing and re-instating PTEs.
      
      As an additional measure, if we detect that the last GGTT PTE was lost
      during suspend, print a warning and re-populate the GGTT PTEs.
      
      On discrete GPUs, the display engine scans out from LMEM which isn't
      subject to DMAR, and presumably the workaround is therefore not needed,
      but that needs to be verified and disabling the workaround for dGPU,
      if possible, will be deferred to a follow-up patch.
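      
      The suspend / resume policy described above can be sketched in plain C (a
      userspace model; the helper names and the IRST tri-state are illustrative,
      not actual i915 symbols):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the suspend-path decision: PTEs may be retained
 * only when we are certain stolen memory survives S3. */
enum irst_state { IRST_DISABLED, IRST_ENABLED, IRST_UNKNOWN };

static bool retain_ggtt_ptes(bool is_integrated, enum irst_state irst)
{
    if (!is_integrated)
        return false;   /* dGPU: keep the old workaround for now */
    if (irst != IRST_DISABLED)
        return false;   /* IRST on, or undetectable: clear and rebuild */
    return true;        /* stolen memory retains the PTEs across S3 */
}

/* On resume, a lost guard PTE at the end of the GGTT means stolen memory
 * did not survive after all; fall back to repopulating everything. */
static bool needs_repopulate(bool retained, bool guard_pte_intact)
{
    return !retained || !guard_pte_intact;
}
```

      The third state matters: when IRST detection is inconclusive, the safe
      choice is the old clear-and-repopulate path.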
      
      v2:
      - Rely on retained PTEs to also speed up suspend and resume re-binding.
      - Re-build GGTT PTEs if Intel RST is enabled.
      v3:
      - Re-build GGTT PTEs also if we can't detect whether Intel RST is enabled,
        or if the guard-page PTE at the end of the GGTT was lost.
      v4:
      - Fix some kerneldoc issues (Matthew Auld), rebase.
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220617152856.249295-1-thomas.hellstrom@linux.intel.com
      2ef6efa7
  4. 06 Apr, 2022 1 commit
  5. 30 Mar, 2022 1 commit
  6. 07 Mar, 2022 1 commit
  7. 02 Feb, 2022 1 commit
  8. 18 Jan, 2022 2 commits
  9. 11 Jan, 2022 2 commits
    • drm/i915: Use vma resources for async unbinding · 2f6b90da
      Thomas Hellström authored
      Implement async (non-blocking) unbinding by not syncing the vma before
      calling unbind on the vma_resource.
      Add the resulting unbind fence to the object's dma_resv from where it is
      picked up by the ttm migration code.
      Ideally these unbind fences should be coalesced with the migration blit
      fence to avoid stalling the migration blit waiting for unbind, as they
      can certainly go on in parallel, but since we don't yet have a
      reasonable data structure to use to coalesce fences and attach the
      resulting fence to a timeline, we defer that for now.
      
      Note that with async unbinding, even though the unbind waits for the
      preceding bind to complete before unbinding, the vma itself might have been
      destroyed in the meantime, clearing the vma pages. Therefore we can
      only allow async unbinding if we have a refcounted sg-list, and keep a
      refcount on that so the vma resource pages stay intact until unbinding
      occurs. If this condition is not met, a request for an async unbind is
      diverted to a sync unbind.
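      
      As a sketch, the gate described above might look like this in plain C (a
      userspace model; the struct and function names are hypothetical, not the
      i915 ones):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model: async unbind is only allowed when the sg-list
 * backing the vma resource is refcounted, so the pages stay intact even
 * if the vma itself is destroyed before the unbind runs. */
struct vma_resource_model {
    bool has_refcounted_sg; /* pages can outlive the vma */
    int sg_refcount;
};

/* Returns true if the unbind may run asynchronously, taking the page
 * reference that keeps the sg-list alive until the unbind fence signals.
 * Returns false to signal diversion to a synchronous unbind. */
static bool try_async_unbind(struct vma_resource_model *res)
{
    if (!res->has_refcounted_sg)
        return false;       /* diverted to a sync unbind */
    res->sg_refcount++;     /* hold pages until unbinding completes */
    return true;
}
```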
      
      v2:
      - Use a separate kmem_cache for vma resources for now to isolate their
        memory allocation and aid debugging.
      - Move the check for vm closed to the actual unbinding thread. Regardless
        of whether the vm is closed, we need the unbind fence to properly wait
        for capture.
      - Clear vma_res::vm on unbind and update its documentation.
      v4:
      - Take cache coloring into account when searching for vma resources
        pending unbind. (Matthew Auld)
      v5:
      - Fix timeout and error check in i915_vma_resource_bind_dep_await().
      - Avoid taking a reference on the object for async binding if
        async unbind capable.
      - Fix braces around a single-line if statement.
      v6:
      - Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>)
      - Don't allow async unbinding if the vma_res pages are not the same as
        the object pages. (Matthew Auld)
      v7:
      - s/unsigned long/u64/ in a number of places (Matthew Auld)
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
      2f6b90da
    • drm/i915: Use the vma resource as argument for gtt binding / unbinding · 39a2bd34
      Thomas Hellström authored
      When introducing asynchronous unbinding, the vma itself may no longer
      be alive when the actual binding or unbinding takes place.
      
      Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
      instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
      Similarly change the insert_entries() op for struct i915_address_space.
      
      Replace a couple of i915_vma_snapshot members with their newly introduced
      i915_vma_resource counterparts, since they have the same lifetime.
      
      Also make sure to avoid changing the struct i915_vma_flags (in particular
      the bind flags) asynchronously. That should now only be done synchronously,
      under the vm mutex.
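      
      The shape of the interface change can be sketched as follows (illustrative
      C; the field and function names are stand-ins, not the real i915_vma_ops
      and i915_vma_resource members):

```c
#include <assert.h>

/* The bind/unbind ops receive a struct holding only what binding needs
 * (offsets, size, flags), so they never dereference the vma, which may
 * already be gone when an async unbind finally runs. */
struct vma_resource {
    unsigned long long start;  /* u64, per the v6/v7 review notes above */
    unsigned long long size;
    unsigned int bound_flags;  /* changed only synchronously, under the vm mutex */
};

struct vma_ops {
    void (*bind_vma)(struct vma_resource *res, unsigned int flags);
    void (*unbind_vma)(struct vma_resource *res);
};

static void example_bind(struct vma_resource *res, unsigned int flags)
{
    res->bound_flags |= flags;  /* record what got bound */
}

static void example_unbind(struct vma_resource *res)
{
    res->bound_flags = 0;       /* no vma dereference anywhere */
}
```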
      
      v2:
      - Update the vma_res::bound_flags when binding to the aliased ggtt
      v6:
      - Remove I915_VMA_ALLOC_BIT (Matthew Auld)
      - Change some members of struct i915_vma_resource from unsigned long to u64
        (Matthew Auld)
      v7:
      - Fix vma resource size parameters to be u64 rather than unsigned long
        (Matthew Auld)
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com
      39a2bd34
  10. 05 Jan, 2022 1 commit
  11. 20 Dec, 2021 1 commit
  12. 18 Dec, 2021 1 commit
  13. 09 Dec, 2021 1 commit
  14. 07 Dec, 2021 1 commit
  15. 01 Dec, 2021 1 commit
    • drm/i915: Use per device iommu check · cca08469
      Tvrtko Ursulin authored
      With both integrated and discrete Intel GPUs in a system, the current
      global check of intel_iommu_gfx_mapped, as done from intel_vtd_active()
      may not be completely accurate.
      
      In this patch we add an i915 parameter to intel_vtd_active() in order to
      prepare it for multiple GPUs, and we also change the check away from the
      Intel-specific intel_iommu_gfx_mapped (a global exported by the Intel IOMMU
      driver) to probing the presence of an IOMMU on a specific device using
      device_iommu_mapped().
      
      This will return true for both IOMMU pass-through and address translation
      modes, which matches the current behaviour. If in the future we wanted to
      distinguish between these two modes we could either use
      iommu_get_domain_for_dev() and check for __IOMMU_DOMAIN_PAGING bit
      indicating address translation, or ask for a new API to be exported from
      the IOMMU core code.
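      
      A userspace model of the per-device check (device_iommu_mapped() itself is
      a real kernel API that tests whether the device sits in an IOMMU group;
      the stubbed device struct here is purely illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct device: in the kernel, membership in an IOMMU
 * group is what device_iommu_mapped() reports. */
struct device_model {
    void *iommu_group;  /* non-NULL when an IOMMU manages this device */
};

static bool device_iommu_mapped_model(const struct device_model *dev)
{
    return dev->iommu_group != NULL;
}

/* intel_vtd_active() now answers per device rather than via a global;
 * true for both pass-through and translation modes, as the text says. */
static bool intel_vtd_active_model(const struct device_model *dev)
{
    return device_iommu_mapped_model(dev);
}
```

      With an iGPU and a dGPU in one system, each can now get a different
      answer, which the old global flag could not express.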
      
      v2:
        * Check for dmar translation specifically, not just iommu domain. (Baolu)
      
      v3:
       * Go back to plain "any domain" check for now, rewrite commit message.
      
      v4:
       * Use device_iommu_mapped. (Robin, Baolu)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Lucas De Marchi <lucas.demarchi@intel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211126141424.493753-1-tvrtko.ursulin@linux.intel.com
      cca08469
  16. 15 Nov, 2021 2 commits
  17. 09 Nov, 2021 1 commit
  18. 03 Nov, 2021 1 commit
  19. 02 Nov, 2021 3 commits
  20. 01 Oct, 2021 1 commit
  21. 24 Sep, 2021 1 commit
    • drm/i915: Reduce the number of objects subject to memcpy recover · a259cc14
      Thomas Hellström authored
      We really only need memcpy restore for objects that affect the
      operability of the migrate context. That is, primarily the page-table
      objects of the migrate VM.
      
      Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early
      restores using memcpy and a way to assign LMEM page-table object flags
      to be used by the vms.
      
      Restore objects without this flag with the gpu blitter and only objects
      carrying the flag using TTM memcpy.
      
      Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and
      defer to a later audit the question of which vms actually need it. Most
      importantly, user-allocated vms with pinned page-table objects can be
      restored using the blitter.
      
      Performance-wise, memcpy restore is probably as fast as gpu restore, if not
      faster, but using gpu restore will help tackle future restrictions in
      mappable LMEM size.
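      
      The restore-path split reduces to a flag check; a minimal sketch (the flag
      value and names below are stand-ins for I915_BO_ALLOC_PM_EARLY and the
      surrounding code, not the actual i915 definitions):

```c
#include <assert.h>

/* Stand-in for I915_BO_ALLOC_PM_EARLY: marks objects the migrate
 * context itself depends on (its page tables), which therefore must be
 * restored by CPU memcpy before the blitter can run at all. */
#define BO_ALLOC_PM_EARLY (1u << 0)

enum restore_method { RESTORE_MEMCPY, RESTORE_BLITTER };

static enum restore_method pick_restore(unsigned int bo_flags)
{
    /* Early objects: TTM memcpy. Everything else: the gpu blitter,
     * which presupposes the early objects are already restored. */
    return (bo_flags & BO_ALLOC_PM_EARLY) ? RESTORE_MEMCPY
                                          : RESTORE_BLITTER;
}
```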
      
      v4:
      - Don't mark the aliasing ppgtt page table flags for early resume, but
        rather the ggtt page table flags as intended. (Matthew Auld)
      - The check for user buffer objects during early resume is pointless, since
        they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld)
      v5:
      - Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored
        before we fire up the migrate context.
      
      Cc: Matthew Brost <matthew.brost@intel.com>
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-8-thomas.hellstrom@linux.intel.com
      a259cc14
  22. 23 Sep, 2021 1 commit
  23. 06 Sep, 2021 1 commit
    • drm/i915: Stop rcu support for i915_address_space · dcc5d820
      Daniel Vetter authored
      The full audit is quite a bit of work:
      
      - i915_dpt has a very simple lifetime (somehow we create a display pagetable vm
        per object, so it's _very_ simple, there's only ever a single vma in there),
        and uses i915_vm_close(), which internally does an i915_vm_put(). No rcu.
      
        Aside: wtf is i915_dpt doing in the intel_display.c garbage collector as a new
        feature, instead of being added as a separate file with some clean-ish interface.
      
        Also, i915_dpt unfortunately re-introduces some coding patterns from
        pre-dma_resv_lock conversion times.
      
      - i915_gem_proto_ctx is fully refcounted and no rcu, all protected by
        fpriv->proto_context_lock.
      
      - i915_gem_context is itself rcu protected, and that might leak to anything it
        points at. Before
      
      	commit cf977e18
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Wed Dec 2 11:21:40 2020 +0000
      
      	    drm/i915/gem: Spring clean debugfs
      
        and
      
      	commit db80a129
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Mon Jan 18 11:08:54 2021 +0000
      
      	    drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects
      
        we had a bunch of debugfs files that relied on rcu protecting everything, but
        those are gone now. The main one was removed even earlier with
      
        There doesn't seem to be anything left that's actually protecting
        stuff now that the ctx->vm itself is invariant. See
      
      	commit ccbc1b97
      	Author: Jason Ekstrand <jason@jlekstrand.net>
      	Date:   Thu Jul 8 10:48:30 2021 -0500
      
      	    drm/i915/gem: Don't allow changing the VM on running contexts (v4)
      
        Note that we drop the vm refcount before the final release of the gem context
        refcount, so this is all very dangerous even without rcu. Note that aside from
        later on creating new engines (a defunct feature) and debug output we've never
        looked at gem_ctx->vm for anything functional, hence why this is ok.
        Fingers crossed.
      
        Preceding patches removed all vestiges of rcu use from gem_ctx->vm
        dereferencing to make it clear it's really not used.
      
        The gem_ctx->rcu protection was introduced in
      
      	commit a4e7ccda
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Fri Oct 4 14:40:09 2019 +0100
      
      	    drm/i915: Move context management under GEM
      
        The commit message is somewhat entertaining because it fails to
        mention this fact completely, and compensates that by an in-commit
        changelog entry that claims that ctx->vm is protected by ctx->mutex.
        Which was the case _before_ this commit, but no longer after it.
      
      - intel_context holds a full reference. Unfortunately intel_context is also rcu
        protected and the reference to the ->vm is dropped before the
        rcu barrier - only the kfree is delayed. So again we need to check
        whether that leaks anywhere on the intel_context->vm. RCU is only
        used to protect intel_context sitting on the breadcrumb lists, which
        don't look at the vm anywhere, so we are fine.
      
        Nothing else relies on rcu protection of intel_context and hence is
        fully protected by the kref refcount alone, which protects
        intel_context->vm in turn.
      
        The breadcrumbs rcu usage was added in
      
      	commit c744d503
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Thu Nov 26 14:04:06 2020 +0000
      
      	    drm/i915/gt: Split the breadcrumb spinlock between global and contexts
      
        Its parent commit added the intel_context rcu protection:
      
      	commit 14d1eaf0
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Thu Nov 26 14:04:05 2020 +0000
      
      	    drm/i915/gt: Protect context lifetime with RCU
      
        giving some credence to my claim that I've actually caught them all.
      
      - drm_i915_gem_object's shares_resv_from pointer has a full refcount to the
        dma_resv, which is a sub-refcount that's released after the final
        i915_vm_put() has been called. Safe.
      
        Aside: Maybe we should have a struct dma_resv_shared which is just dma_resv +
        kref as a stand-alone thing. It's a pretty useful pattern which other drivers
        might want to copy.
      
        For a bit more context see
      
      	commit 4d8151ae
      	Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      	Date:   Tue Jun 1 09:46:41 2021 +0200
      
      	    drm/i915: Don't free shared locks while shared
      
      - the fpriv->vm_xa was relying on rcu_read_lock for lookup, but that
        was updated in a prep patch too to just be a spinlock-protected
        lookup.
      
      - intel_gt->vm is set at driver load in intel_gt_init() and released
        in intel_gt_driver_release(). There seems to be some issue that
        in some error paths this is called twice, but otherwise no rcu to be
        found anywhere. This was added in the below commit, which
        unfortunately doesn't explain why this complication exists.
      
      	commit e6ba7648
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Sat Dec 21 16:03:24 2019 +0000
      
      	    drm/i915: Remove i915->kernel_context
      
        The proper fix most likely for this is to start using drmm_ at large
        scale, but that's also huge amounts of work.
      
      - i915_vma->vm is some real pain, because the vma is rcu protected, at
        least in the vma lookup in the context lookup cache in
        eb_lookup_vma(). This was added in
      
      	commit 4ff4b44c
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Fri Jun 16 15:05:16 2017 +0100
      
      	    drm/i915: Store a direct lookup from object handle to vma
      
        This was changed from the hashtable to a radix tree, with the locking
        unchanged, in
      
      	commit d1b48c1e
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Wed Aug 16 09:52:08 2017 +0100
      
      	    drm/i915: Replace execbuf vma ht with an idr
      
        In
      
      	commit 93159e12
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Mon Mar 23 09:28:41 2020 +0000
      
      	    drm/i915/gem: Avoid gem_context->mutex for simple vma lookup
      
        the locking was changed from dev->struct_mutex to rcu, which added
        the requirement to rcu protect i915_vma. Somehow this was missed in
        review (or I'm completely blind).
      
        Irrespective of all that the vma lookup cache rcu_read_lock grabs a
        full reference of the vma and the rcu doesn't leak further. So no
        impact on i915_address_space from that.
      
        I have not found any other rcu use for i915_vma, but given that it
        seems broken I also didn't bother to do a careful in-depth audit.
      
      Altogether there's nothing left in-tree anymore which requires that a
      pointer deref to an i915_address_space be safe under rcu_read_lock
      only.
      
      rcu protection of i915_address_space was introduced in
      
      commit b32fa811
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Thu Jun 20 19:37:05 2019 +0100
      
          drm/i915/gtt: Defer address space cleanup to an RCU worker
      
      by mixing up a bugfix (i915_address_space needs to be released from
      a worker) with enabling rcu support. The commit message also seems
      somewhat confused, because it talks about cleanup of WC pages
      requiring sleep, while the code and linked bugzilla are about a
      requirement to take dev->struct_mutex (which yes sleeps but it's a
      much more specific problem). Since final kref_put can be called from
      pretty much anywhere (including hardirq context through the
      scheduler's i915_active cleanup) we need a worker here. Hence that
      part must be kept.
      
      Ideally all these reclaim workers should have some kind of integration
      with our shrinkers, but for some of these it's rather tricky. Anyway,
      that's a preexisting condition in the codebase that we won't fix in
      this patch here.
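      
      The pattern that must be kept can be modelled in a few lines of C
      (illustrative only; the real code uses kref_put() and queues a release
      worker, while here a flag stands in for queue_work()):

```c
#include <assert.h>
#include <stdbool.h>

/* The final put may fire in hardirq context (e.g. via i915_active
 * cleanup from the scheduler), so the teardown, which can sleep, is
 * deferred to a worker instead of running inline. */
struct vm_model {
    int refcount;
    bool release_queued;  /* stands in for queue_work() on a release worker */
};

static void vm_put(struct vm_model *vm)
{
    if (--vm->refcount == 0)
        vm->release_queued = true;  /* sleeping cleanup runs later, in process context */
}
```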
      
      We also remove the rcu_barrier in ggtt_cleanup_hw added in
      
      commit 60a4233a
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Mon Jul 29 14:24:12 2019 +0100
      
          drm/i915: Flush the i915_vm_release before ggtt shutdown
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jon Bloomfield <jon.bloomfield@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Jason Ekstrand <jason@jlekstrand.net>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-11-daniel.vetter@ffwll.ch
      dcc5d820
  24. 29 Jul, 2021 1 commit
  25. 16 Jul, 2021 1 commit
  26. 05 Jun, 2021 1 commit
  27. 01 Jun, 2021 1 commit
  28. 07 May, 2021 1 commit
  29. 29 Apr, 2021 1 commit
    • drm/i915: Use trylock in shrinker for ggtt on bsw vt-d and bxt, v2. · bc6f80cc
      Maarten Lankhorst authored
      The stop_machine() lock may allocate memory, but is called inside
      vm->mutex, which is taken in the shrinker. This will cause a lockdep
      splat, as can be seen below:
      
      <4>[  462.585762] ======================================================
      <4>[  462.585768] WARNING: possible circular locking dependency detected
      <4>[  462.585773] 5.12.0-rc5-CI-Trybot_7644+ #1 Tainted: G     U
      <4>[  462.585779] ------------------------------------------------------
      <4>[  462.585783] i915_selftest/5540 is trying to acquire lock:
      <4>[  462.585788] ffffffff826440b0 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x12/0x30
      <4>[  462.585814]
                        but task is already holding lock:
      <4>[  462.585818] ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
      <4>[  462.586301]
                        which lock already depends on the new lock.
      
      <4>[  462.586305]
                        the existing dependency chain (in reverse order) is:
      <4>[  462.586309]
                        -> #2 (&vm->mutex/1){+.+.}-{3:3}:
      <4>[  462.586323]        i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
      <4>[  462.586719]        i915_address_space_init+0x12d/0x130 [i915]
      <4>[  462.587092]        ppgtt_init+0x4e/0x80 [i915]
      <4>[  462.587467]        gen8_ppgtt_create+0x3e/0x5c0 [i915]
      <4>[  462.587828]        i915_ppgtt_create+0x28/0xf0 [i915]
      <4>[  462.588203]        intel_gt_init+0x123/0x370 [i915]
      <4>[  462.588572]        i915_gem_init+0x129/0x1f0 [i915]
      <4>[  462.588971]        i915_driver_probe+0x753/0xd80 [i915]
      <4>[  462.589320]        i915_pci_probe+0x43/0x1d0 [i915]
      <4>[  462.589671]        pci_device_probe+0x9e/0x110
      <4>[  462.589680]        really_probe+0xea/0x410
      <4>[  462.589690]        driver_probe_device+0xd9/0x140
      <4>[  462.589697]        device_driver_attach+0x4a/0x50
      <4>[  462.589704]        __driver_attach+0x83/0x140
      <4>[  462.589711]        bus_for_each_dev+0x75/0xc0
      <4>[  462.589718]        bus_add_driver+0x14b/0x1f0
      <4>[  462.589724]        driver_register+0x66/0xb0
      <4>[  462.589731]        i915_init+0x70/0x87 [i915]
      <4>[  462.590053]        do_one_initcall+0x56/0x2e0
      <4>[  462.590061]        do_init_module+0x55/0x200
      <4>[  462.590068]        load_module+0x2703/0x2990
      <4>[  462.590074]        __do_sys_finit_module+0xad/0x110
      <4>[  462.590080]        do_syscall_64+0x33/0x80
      <4>[  462.590089]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.590096]
                        -> #1 (fs_reclaim){+.+.}-{0:0}:
      <4>[  462.590109]        fs_reclaim_acquire+0x9f/0xd0
      <4>[  462.590118]        kmem_cache_alloc_trace+0x3d/0x430
      <4>[  462.590126]        intel_cpuc_prepare+0x3b/0x1b0
      <4>[  462.590133]        cpuhp_invoke_callback+0x9e/0x890
      <4>[  462.590141]        _cpu_up+0xa4/0x130
      <4>[  462.590147]        cpu_up+0x82/0x90
      <4>[  462.590153]        bringup_nonboot_cpus+0x4a/0x60
      <4>[  462.590159]        smp_init+0x21/0x5c
      <4>[  462.590167]        kernel_init_freeable+0x8a/0x1b7
      <4>[  462.590175]        kernel_init+0x5/0xff
      <4>[  462.590181]        ret_from_fork+0x22/0x30
      <4>[  462.590187]
                        -> #0 (cpu_hotplug_lock){++++}-{0:0}:
      <4>[  462.590199]        __lock_acquire+0x1520/0x2590
      <4>[  462.590207]        lock_acquire+0xd1/0x3d0
      <4>[  462.590213]        cpus_read_lock+0x39/0xc0
      <4>[  462.590219]        stop_machine+0x12/0x30
      <4>[  462.590226]        bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
      <4>[  462.590601]        ggtt_bind_vma+0x5d/0x80 [i915]
      <4>[  462.590970]        i915_vma_bind+0xdc/0x1c0 [i915]
      <4>[  462.591374]        i915_vma_pin_ww+0x435/0xb40 [i915]
      <4>[  462.591779]        make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.592170]        igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
      <4>[  462.592562]        __i915_subtests.cold.7+0x42/0x92 [i915]
      <4>[  462.592995]        __run_selftests.part.3+0x10d/0x172 [i915]
      <4>[  462.593428]        i915_live_selftests.cold.5+0x1f/0x47 [i915]
      <4>[  462.593860]        i915_pci_probe+0x93/0x1d0 [i915]
      <4>[  462.594210]        pci_device_probe+0x9e/0x110
      <4>[  462.594217]        really_probe+0xea/0x410
      <4>[  462.594226]        driver_probe_device+0xd9/0x140
      <4>[  462.594233]        device_driver_attach+0x4a/0x50
      <4>[  462.594240]        __driver_attach+0x83/0x140
      <4>[  462.594247]        bus_for_each_dev+0x75/0xc0
      <4>[  462.594254]        bus_add_driver+0x14b/0x1f0
      <4>[  462.594260]        driver_register+0x66/0xb0
      <4>[  462.594267]        i915_init+0x70/0x87 [i915]
      <4>[  462.594586]        do_one_initcall+0x56/0x2e0
      <4>[  462.594592]        do_init_module+0x55/0x200
      <4>[  462.594599]        load_module+0x2703/0x2990
      <4>[  462.594605]        __do_sys_finit_module+0xad/0x110
      <4>[  462.594612]        do_syscall_64+0x33/0x80
      <4>[  462.594618]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.594625]
                        other info that might help us debug this:
      
      <4>[  462.594629] Chain exists of:
                          cpu_hotplug_lock --> fs_reclaim --> &vm->mutex/1
      
      <4>[  462.594645]  Possible unsafe locking scenario:
      
      <4>[  462.594648]        CPU0                    CPU1
      <4>[  462.594652]        ----                    ----
      <4>[  462.594655]   lock(&vm->mutex/1);
      <4>[  462.594664]                                lock(fs_reclaim);
      <4>[  462.594671]                                lock(&vm->mutex/1);
      <4>[  462.594679]   lock(cpu_hotplug_lock);
      <4>[  462.594686]
                         *** DEADLOCK ***
      
      <4>[  462.594690] 4 locks held by i915_selftest/5540:
      <4>[  462.594696]  #0: ffff888100fbc240 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x50
      <4>[  462.594715]  #1: ffffc900006cb9a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: make_obj_busy+0x81/0x330 [i915]
      <4>[  462.595118]  #2: ffff88812a6081e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: make_obj_busy+0x21f/0x330 [i915]
      <4>[  462.595519]  #3: ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
      <4>[  462.595934]
                        stack backtrace:
      <4>[  462.595939] CPU: 0 PID: 5540 Comm: i915_selftest Tainted: G     U            5.12.0-rc5-CI-Trybot_7644+ #1
      <4>[  462.595947] Hardware name: GOOGLE Kefka/Kefka, BIOS MrChromebox 02/04/2018
      <4>[  462.595952] Call Trace:
      <4>[  462.595961]  dump_stack+0x7f/0xad
      <4>[  462.595974]  check_noncircular+0x12e/0x150
      <4>[  462.595982]  ? save_stack.isra.17+0x3f/0x70
      <4>[  462.595991]  ? drm_mm_insert_node_in_range+0x34a/0x5b0
      <4>[  462.596000]  ? i915_vma_pin_ww+0x9ec/0xb40 [i915]
      <4>[  462.596410]  __lock_acquire+0x1520/0x2590
      <4>[  462.596419]  ? do_init_module+0x55/0x200
      <4>[  462.596429]  lock_acquire+0xd1/0x3d0
      <4>[  462.596435]  ? stop_machine+0x12/0x30
      <4>[  462.596445]  ? gen8_ggtt_insert_entries+0xf0/0xf0 [i915]
      <4>[  462.596816]  cpus_read_lock+0x39/0xc0
      <4>[  462.596824]  ? stop_machine+0x12/0x30
      <4>[  462.596831]  stop_machine+0x12/0x30
      <4>[  462.596839]  bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
      <4>[  462.597210]  ggtt_bind_vma+0x5d/0x80 [i915]
      <4>[  462.597580]  i915_vma_bind+0xdc/0x1c0 [i915]
      <4>[  462.597986]  i915_vma_pin_ww+0x435/0xb40 [i915]
      <4>[  462.598395]  ? make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.598786]  make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.599180]  ? 0xffffffff81000000
      <4>[  462.599187]  ? debug_mutex_unlock+0x50/0xa0
      <4>[  462.599198]  igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
      <4>[  462.599592]  __i915_subtests.cold.7+0x42/0x92 [i915]
      <4>[  462.600026]  ? i915_perf_selftests+0x20/0x20 [i915]
      <4>[  462.600422]  ? __i915_nop_setup+0x10/0x10 [i915]
      <4>[  462.600820]  __run_selftests.part.3+0x10d/0x172 [i915]
      <4>[  462.601253]  i915_live_selftests.cold.5+0x1f/0x47 [i915]
      <4>[  462.601686]  i915_pci_probe+0x93/0x1d0 [i915]
      <4>[  462.602037]  ? _raw_spin_unlock_irqrestore+0x3d/0x60
      <4>[  462.602047]  pci_device_probe+0x9e/0x110
      <4>[  462.602057]  really_probe+0xea/0x410
      <4>[  462.602067]  driver_probe_device+0xd9/0x140
      <4>[  462.602075]  device_driver_attach+0x4a/0x50
      <4>[  462.602084]  __driver_attach+0x83/0x140
      <4>[  462.602091]  ? device_driver_attach+0x50/0x50
      <4>[  462.602099]  ? device_driver_attach+0x50/0x50
      <4>[  462.602107]  bus_for_each_dev+0x75/0xc0
      <4>[  462.602116]  bus_add_driver+0x14b/0x1f0
      <4>[  462.602124]  driver_register+0x66/0xb0
      <4>[  462.602133]  i915_init+0x70/0x87 [i915]
      <4>[  462.602453]  ? 0xffffffffa0606000
      <4>[  462.602458]  do_one_initcall+0x56/0x2e0
      <4>[  462.602466]  ? kmem_cache_alloc_trace+0x374/0x430
      <4>[  462.602476]  do_init_module+0x55/0x200
      <4>[  462.602484]  load_module+0x2703/0x2990
      <4>[  462.602500]  ? __do_sys_finit_module+0xad/0x110
      <4>[  462.602507]  __do_sys_finit_module+0xad/0x110
      <4>[  462.602519]  do_syscall_64+0x33/0x80
      <4>[  462.602527]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.602535] RIP: 0033:0x7fab69d8d89d
      
      Changes since v1:
      - Add lockdep annotations during init, to ensure that lockdep is primed.
        This also fixes a false positive when reading /proc/lockdep_stats
        during module reload.
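      
      The fix named in the title reduces to backing off on contention: the
      shrinker only trylocks vm->mutex, so it never completes the
      cpu_hotplug_lock -> fs_reclaim -> vm->mutex cycle shown above. A toy
      model (not the i915 shrinker code; names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

struct mutex_model { bool held; };

static bool mutex_trylock_model(struct mutex_model *m)
{
    if (m->held)
        return false;  /* contended: caller must not block here */
    m->held = true;
    return true;
}

/* Returns 1 if the vm was shrunk, 0 if it was skipped on contention. */
static int shrink_vm(struct mutex_model *vm_mutex)
{
    if (!mutex_trylock_model(vm_mutex))
        return 0;       /* back off instead of closing the lock cycle */
    /* ... evict idle vmas under the mutex ... */
    vm_mutex->held = false;  /* unlock */
    return 1;
}
```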
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210426102351.921874-1-maarten.lankhorst@linux.intel.com
      bc6f80cc
  30. 27 Apr, 2021 1 commit
    • drm/i915/gtt: map the PD up front · 529b9ec8
      Matthew Auld authored
      We need to generalise our accessor for the page directories and tables from
      using the simple kmap_atomic to supporting local memory, and this setup
      must be done on acquisition of the backing storage, prior to entering
      fence execution contexts. Here we replace the kmap with the object
      mapping code, which for a simple single-page shmemfs object will return a
      plain kmap that is then kept for the lifetime of the page directory.
      
      Note that keeping the mapping around is a potential concern here, since
      while the vma is pinned the mapping remains there for the PDs
      underneath, or at least until the used_count reaches zero, at which
      point we can safely destroy the mapping. For 32b this will be even worse
      since the address space is more limited, but since this change mostly
      impacts full ppGTT platforms, the justification is that for modern
      platforms we shouldn't care too much about 32b.
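      
      The keep-the-mapping scheme can be modelled like so (userspace sketch;
      malloc stands in for the object mapping call, and all names here are
      illustrative, not the i915 page-directory code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The page directory is mapped once, up front, when its backing store
 * is acquired; the mapping is then reused for every access and only
 * destroyed when used_count drops to zero. */
struct pd_model {
    void *vaddr;     /* persistent mapping, valid while used_count > 0 */
    int used_count;
};

static void *pd_map(struct pd_model *pd)
{
    if (!pd->vaddr)
        pd->vaddr = malloc(4096);  /* stand-in for the object mapping */
    pd->used_count++;
    return pd->vaddr;              /* same pointer for every user */
}

static void pd_unmap(struct pd_model *pd)
{
    if (--pd->used_count == 0) {
        free(pd->vaddr);           /* safe to destroy only now */
        pd->vaddr = NULL;
    }
}
```

      Contrast with kmap_atomic, which would have created and torn down a
      transient mapping around every individual access.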
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-3-matthew.auld@intel.com
      529b9ec8
  31. 29 Mar, 2021 2 commits
  32. 24 Mar, 2021 3 commits