  1. 29 Jun, 2022 1 commit
  2. 22 Jun, 2022 1 commit
  3. 20 Jun, 2022 1 commit
    • drm/i915: Improve on suspend / resume time with VT-d enabled · 2ef6efa7
      Thomas Hellström authored
      When DMAR / VT-d is enabled, the display engine uses overfetching,
      presumably to deal with the increased latency. To avoid display engine
      errors and DMAR faults, as a workaround the GGTT is populated with scratch
      PTEs when VT-d is enabled. However, starting with gen10, write-combined
      writing of scratch PTEs is no longer possible, and as a result populating
      the full GGTT with scratch PTEs, for example on resume, becomes very slow
      as uncached access is needed.
      
      Therefore, on integrated GPUs, utilize the fact that the PTEs are stored in
      stolen memory, which retains its content across S3 suspend: don't clear the
      PTEs on suspend and resume. This improves resume time by around 100 ms.
      While 100+ ms might seem short, it's 10% to 20% of total resume time and
      important in some applications.
      
      One notable exception is Intel Rapid Start Technology which may cause
      stolen memory to be lost across what the OS perceives as S3 suspend.
      If IRST is enabled or if we can't detect whether IRST is enabled, retain
      the old workaround, clearing and re-instating PTEs.
      
      As an additional measure, if we detect that the last GGTT PTE was lost
      during suspend, print a warning and re-populate the GGTT PTEs.
      
      On discrete GPUs, the display engine scans out from LMEM which isn't
      subject to DMAR, and presumably the workaround is therefore not needed,
      but that needs to be verified and disabling the workaround for dGPU,
      if possible, will be deferred to a follow-up patch.
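      
      The suspend / resume policy described above can be sketched in plain C (a
      userspace model; the helper names and the IRST tri-state are illustrative,
      not actual i915 symbols):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the suspend-path decision: PTEs may be retained
 * only when we are certain stolen memory survives S3. */
enum irst_state { IRST_DISABLED, IRST_ENABLED, IRST_UNKNOWN };

static bool retain_ggtt_ptes(bool is_integrated, enum irst_state irst)
{
    if (!is_integrated)
        return false;   /* dGPU: keep the old workaround for now */
    if (irst != IRST_DISABLED)
        return false;   /* IRST on, or undetectable: clear and rebuild */
    return true;        /* stolen memory retains the PTEs across S3 */
}

/* On resume, a lost guard PTE at the end of the GGTT means stolen memory
 * did not survive after all; fall back to repopulating everything. */
static bool needs_repopulate(bool retained, bool guard_pte_intact)
{
    return !retained || !guard_pte_intact;
}
```

      The third state matters: when IRST detection is inconclusive, the safe
      choice is the old clear-and-repopulate path.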
      
      v2:
      - Rely on retained PTEs to also speed up suspend and resume re-binding.
      - Re-build GGTT PTEs if Intel RST is enabled.
      v3:
      - Re-build GGTT PTEs also if we can't detect whether Intel RST is enabled,
        or if the guard-page PTE at the end of the GGTT was lost.
      v4:
      - Fix some kerneldoc issues (Matthew Auld), rebase.
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220617152856.249295-1-thomas.hellstrom@linux.intel.com
      2ef6efa7
  4. 06 Apr, 2022 1 commit
  5. 30 Mar, 2022 1 commit
  6. 07 Mar, 2022 1 commit
  7. 02 Feb, 2022 1 commit
  8. 18 Jan, 2022 2 commits
  9. 11 Jan, 2022 2 commits
    • drm/i915: Use vma resources for async unbinding · 2f6b90da
      Thomas Hellström authored
      Implement async (non-blocking) unbinding by not syncing the vma before
      calling unbind on the vma_resource.
      Add the resulting unbind fence to the object's dma_resv from where it is
      picked up by the ttm migration code.
      Ideally these unbind fences should be coalesced with the migration blit
      fence to avoid stalling the migration blit waiting for unbind, as they
      can certainly go on in parallel, but since we don't yet have a
      reasonable data structure to use to coalesce fences and attach the
      resulting fence to a timeline, we defer that for now.
      
      Note that with async unbinding, even though the unbind waits for the
      preceding bind to complete before unbinding, the vma itself might have been
      destroyed in the meantime, clearing the vma pages. Therefore we can
      only allow async unbinding if we have a refcounted sg-list, and keep a
      refcount on that so the vma resource pages stay intact until unbinding
      occurs. If this condition is not met, a request for an async unbind is
      diverted to a sync unbind.
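      
      As a sketch, the gate described above might look like this in plain C (a
      userspace model; the struct and function names are hypothetical, not the
      i915 ones):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model: async unbind is only allowed when the sg-list
 * backing the vma resource is refcounted, so the pages stay intact even
 * if the vma itself is destroyed before the unbind runs. */
struct vma_resource_model {
    bool has_refcounted_sg; /* pages can outlive the vma */
    int sg_refcount;
};

/* Returns true if the unbind may run asynchronously, taking the page
 * reference that keeps the sg-list alive until the unbind fence signals.
 * Returns false to signal diversion to a synchronous unbind. */
static bool try_async_unbind(struct vma_resource_model *res)
{
    if (!res->has_refcounted_sg)
        return false;       /* diverted to a sync unbind */
    res->sg_refcount++;     /* hold pages until unbinding completes */
    return true;
}
```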
      
      v2:
      - Use a separate kmem_cache for vma resources for now to isolate their
        memory allocation and aid debugging.
      - Move the check for vm closed to the actual unbinding thread. Regardless
        of whether the vm is closed, we need the unbind fence to properly wait
        for capture.
      - Clear vma_res::vm on unbind and update its documentation.
      v4:
      - Take cache coloring into account when searching for vma resources
        pending unbind. (Matthew Auld)
      v5:
      - Fix timeout and error check in i915_vma_resource_bind_dep_await().
      - Avoid taking a reference on the object for async binding if
        async unbind capable.
      - Fix braces around a single-line if statement.
      v6:
      - Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>)
      - Don't allow async unbinding if the vma_res pages are not the same as
        the object pages. (Matthew Auld)
      v7:
      - s/unsigned long/u64/ in a number of places (Matthew Auld)
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com
      2f6b90da
    • drm/i915: Use the vma resource as argument for gtt binding / unbinding · 39a2bd34
      Thomas Hellström authored
      When introducing asynchronous unbinding, the vma itself may no longer
      be alive when the actual binding or unbinding takes place.
      
      Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
      instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
      Similarly change the insert_entries() op for struct i915_address_space.
      
      Replace a couple of i915_vma_snapshot members with their newly introduced
      i915_vma_resource counterparts, since they have the same lifetime.
      
      Also make sure to avoid changing the struct i915_vma_flags (in particular
      the bind flags) asynchronously. That should now only be done synchronously,
      under the vm mutex.
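      
      The shape of the interface change can be sketched as follows (illustrative
      C; the field and function names are stand-ins, not the real i915_vma_ops
      and i915_vma_resource members):

```c
#include <assert.h>

/* The bind/unbind ops receive a struct holding only what binding needs
 * (offsets, size, flags), so they never dereference the vma, which may
 * already be gone when an async unbind finally runs. */
struct vma_resource {
    unsigned long long start;  /* u64, per the v6/v7 review notes above */
    unsigned long long size;
    unsigned int bound_flags;  /* changed only synchronously, under the vm mutex */
};

struct vma_ops {
    void (*bind_vma)(struct vma_resource *res, unsigned int flags);
    void (*unbind_vma)(struct vma_resource *res);
};

static void example_bind(struct vma_resource *res, unsigned int flags)
{
    res->bound_flags |= flags;  /* record what got bound */
}

static void example_unbind(struct vma_resource *res)
{
    res->bound_flags = 0;       /* no vma dereference anywhere */
}
```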
      
      v2:
      - Update the vma_res::bound_flags when binding to the aliased ggtt
      v6:
      - Remove I915_VMA_ALLOC_BIT (Matthew Auld)
      - Change some members of struct i915_vma_resource from unsigned long to u64
        (Matthew Auld)
      v7:
      - Fix vma resource size parameters to be u64 rather than unsigned long
        (Matthew Auld)
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com
      39a2bd34
  10. 05 Jan, 2022 1 commit
  11. 20 Dec, 2021 1 commit
  12. 18 Dec, 2021 1 commit
  13. 09 Dec, 2021 1 commit
  14. 07 Dec, 2021 1 commit
  15. 01 Dec, 2021 1 commit
    • drm/i915: Use per device iommu check · cca08469
      Tvrtko Ursulin authored
      With both integrated and discrete Intel GPUs in a system, the current
      global check of intel_iommu_gfx_mapped, as done from intel_vtd_active()
      may not be completely accurate.
      
      In this patch we add an i915 parameter to intel_vtd_active() in order to
      prepare it for multiple GPUs, and we also change the check away from the
      Intel-specific intel_iommu_gfx_mapped (a global exported by the Intel IOMMU
      driver) to probing the presence of an IOMMU on a specific device using
      device_iommu_mapped().
      
      This will return true for both IOMMU pass-through and address translation
      modes, which matches the current behaviour. If in the future we wanted to
      distinguish between these two modes we could either use
      iommu_get_domain_for_dev() and check for __IOMMU_DOMAIN_PAGING bit
      indicating address translation, or ask for a new API to be exported from
      the IOMMU core code.
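      
      A userspace model of the per-device check (device_iommu_mapped() itself is
      a real kernel API that tests whether the device sits in an IOMMU group;
      the stubbed device struct here is purely illustrative):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for struct device: in the kernel, membership in an IOMMU
 * group is what device_iommu_mapped() reports. */
struct device_model {
    void *iommu_group;  /* non-NULL when an IOMMU manages this device */
};

static bool device_iommu_mapped_model(const struct device_model *dev)
{
    return dev->iommu_group != NULL;
}

/* intel_vtd_active() now answers per device rather than via a global;
 * true for both pass-through and translation modes, as the text says. */
static bool intel_vtd_active_model(const struct device_model *dev)
{
    return device_iommu_mapped_model(dev);
}
```

      With an iGPU and a dGPU in one system, each can now get a different
      answer, which the old global flag could not express.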
      
      v2:
        * Check for dmar translation specifically, not just iommu domain. (Baolu)
      
      v3:
       * Go back to plain "any domain" check for now, rewrite commit message.
      
      v4:
       * Use device_iommu_mapped. (Robin, Baolu)
      Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Cc: Lu Baolu <baolu.lu@linux.intel.com>
      Cc: Lucas De Marchi <lucas.demarchi@intel.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Acked-by: Robin Murphy <robin.murphy@arm.com>
      Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
      Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211126141424.493753-1-tvrtko.ursulin@linux.intel.com
      cca08469
  16. 15 Nov, 2021 2 commits
  17. 09 Nov, 2021 1 commit
  18. 03 Nov, 2021 1 commit
  19. 02 Nov, 2021 3 commits
  20. 01 Oct, 2021 1 commit
  21. 24 Sep, 2021 1 commit
    • drm/i915: Reduce the number of objects subject to memcpy recover · a259cc14
      Thomas Hellström authored
      We really only need memcpy restore for objects that affect the
      operability of the migrate context. That is, primarily the page-table
      objects of the migrate VM.
      
      Add an object flag, I915_BO_ALLOC_PM_EARLY for objects that need early
      restores using memcpy and a way to assign LMEM page-table object flags
      to be used by the vms.
      
      Restore objects without this flag with the gpu blitter and only objects
      carrying the flag using TTM memcpy.
      
      Initially mark the migrate, gt, gtt and vgpu vms to use this flag, and
      defer to a later audit the question of which vms actually need it. Most
      importantly, user-allocated vms with pinned page-table objects can be
      restored using the blitter.
      
      Performance-wise, memcpy restore is probably as fast as gpu restore, if not
      faster, but using gpu restore will help tackle future restrictions in
      mappable LMEM size.
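      
      The restore-path split reduces to a flag check; a minimal sketch (the flag
      value and names below are stand-ins for I915_BO_ALLOC_PM_EARLY and the
      surrounding code, not the actual i915 definitions):

```c
#include <assert.h>

/* Stand-in for I915_BO_ALLOC_PM_EARLY: marks objects the migrate
 * context itself depends on (its page tables), which therefore must be
 * restored by CPU memcpy before the blitter can run at all. */
#define BO_ALLOC_PM_EARLY (1u << 0)

enum restore_method { RESTORE_MEMCPY, RESTORE_BLITTER };

static enum restore_method pick_restore(unsigned int bo_flags)
{
    /* Early objects: TTM memcpy. Everything else: the gpu blitter,
     * which presupposes the early objects are already restored. */
    return (bo_flags & BO_ALLOC_PM_EARLY) ? RESTORE_MEMCPY
                                          : RESTORE_BLITTER;
}
```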
      
      v4:
      - Don't mark the aliasing ppgtt page table flags for early resume, but
        rather the ggtt page table flags as intended. (Matthew Auld)
      - The check for user buffer objects during early resume is pointless, since
        they are never marked I915_BO_ALLOC_PM_EARLY. (Matthew Auld)
      v5:
      - Mark GuC LMEM objects with I915_BO_ALLOC_PM_EARLY to have them restored
        before we fire up the migrate context.
      
      Cc: Matthew Brost <matthew.brost@intel.com>
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Matthew Auld <matthew.auld@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-8-thomas.hellstrom@linux.intel.com
      a259cc14
  22. 23 Sep, 2021 1 commit
  23. 06 Sep, 2021 1 commit
    • drm/i915: Stop rcu support for i915_address_space · dcc5d820
      Daniel Vetter authored
      The full audit is quite a bit of work:
      
      - i915_dpt has a very simple lifetime (somehow we create a display pagetable vm
        per object, so it's _very_ simple, there's only ever a single vma in there),
        and uses i915_vm_close(), which internally does an i915_vm_put(). No rcu.
      
        Aside: wtf is i915_dpt doing in the intel_display.c garbage collector as a new
        feature, instead of being added as a separate file with some clean-ish interface.
      
        Also, i915_dpt unfortunately re-introduces some coding patterns from
        pre-dma_resv_lock conversion times.
      
      - i915_gem_proto_ctx is fully refcounted and no rcu, all protected by
        fpriv->proto_context_lock.
      
      - i915_gem_context is itself rcu protected, and that might leak to anything it
        points at. Before
      
      	commit cf977e18
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Wed Dec 2 11:21:40 2020 +0000
      
      	    drm/i915/gem: Spring clean debugfs
      
        and
      
      	commit db80a129
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Mon Jan 18 11:08:54 2021 +0000
      
      	    drm/i915/gem: Remove per-client stats from debugfs/i915_gem_objects
      
        we had a bunch of debugfs files that relied on rcu protecting everything, but
        those are gone now. The main one was removed even earlier with
      
        There doesn't seem to be anything left that's actually protecting
        stuff now that the ctx->vm itself is invariant. See
      
      	commit ccbc1b97
      	Author: Jason Ekstrand <jason@jlekstrand.net>
      	Date:   Thu Jul 8 10:48:30 2021 -0500
      
      	    drm/i915/gem: Don't allow changing the VM on running contexts (v4)
      
        Note that we drop the vm refcount before the final release of the gem context
        refcount, so this is all very dangerous even without rcu. Note that aside from
        later on creating new engines (a defunct feature) and debug output we've never
        looked at gem_ctx->vm for anything functional, hence why this is ok.
        Fingers crossed.
      
        Preceding patches removed all vestiges of rcu use from gem_ctx->vm
        dereferencing to make it clear it's really not used.
      
        The gem_ctx->rcu protection was introduced in
      
      	commit a4e7ccda
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Fri Oct 4 14:40:09 2019 +0100
      
      	    drm/i915: Move context management under GEM
      
        The commit message is somewhat entertaining because it fails to
        mention this fact completely, and compensates that by an in-commit
        changelog entry that claims that ctx->vm is protected by ctx->mutex.
        Which was the case _before_ this commit, but no longer after it.
      
      - intel_context holds a full reference. Unfortunately intel_context is also rcu
        protected and the reference to the ->vm is dropped before the
        rcu barrier - only the kfree is delayed. So again we need to check
        whether that leaks anywhere on the intel_context->vm. RCU is only
        used to protect intel_context sitting on the breadcrumb lists, which
        don't look at the vm anywhere, so we are fine.
      
        Nothing else relies on rcu protection of intel_context and hence is
        fully protected by the kref refcount alone, which protects
        intel_context->vm in turn.
      
        The breadcrumbs rcu usage was added in
      
      	commit c744d503
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Thu Nov 26 14:04:06 2020 +0000
      
      	    drm/i915/gt: Split the breadcrumb spinlock between global and contexts
      
        Its parent commit added the intel_context rcu protection:
      
      	commit 14d1eaf0
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Thu Nov 26 14:04:05 2020 +0000
      
      	    drm/i915/gt: Protect context lifetime with RCU
      
        giving some credence to my claim that I've actually caught them all.
      
      - drm_i915_gem_object's shares_resv_from pointer has a full refcount to the
        dma_resv, which is a sub-refcount that's released after the final
        i915_vm_put() has been called. Safe.
      
        Aside: Maybe we should have a struct dma_resv_shared which is just dma_resv +
        kref as a stand-alone thing. It's a pretty useful pattern which other drivers
        might want to copy.
      
        For a bit more context see
      
      	commit 4d8151ae
      	Author: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      	Date:   Tue Jun 1 09:46:41 2021 +0200
      
      	    drm/i915: Don't free shared locks while shared
      
      - the fpriv->vm_xa was relying on rcu_read_lock for lookup, but that
        was updated in a prep patch too to just be a spinlock-protected
        lookup.
      
      - intel_gt->vm is set at driver load in intel_gt_init() and released
        in intel_gt_driver_release(). There seems to be some issue that
        in some error paths this is called twice, but otherwise no rcu to be
        found anywhere. This was added in the below commit, which
        unfortunately doesn't explain why this complication exists.
      
      	commit e6ba7648
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Sat Dec 21 16:03:24 2019 +0000
      
      	    drm/i915: Remove i915->kernel_context
      
        The proper fix most likely for this is to start using drmm_ at large
        scale, but that's also huge amounts of work.
      
      - i915_vma->vm is some real pain, because the vma is rcu protected, at
        least in the vma lookup in the context lookup cache in
        eb_lookup_vma(). This was added in
      
      	commit 4ff4b44c
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Fri Jun 16 15:05:16 2017 +0100
      
      	    drm/i915: Store a direct lookup from object handle to vma
      
        This was changed from the hashtable to a radix tree, with the locking
        unchanged, in
      
      	commit d1b48c1e
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Wed Aug 16 09:52:08 2017 +0100
      
      	    drm/i915: Replace execbuf vma ht with an idr
      
        In
      
      	commit 93159e12
      	Author: Chris Wilson <chris@chris-wilson.co.uk>
      	Date:   Mon Mar 23 09:28:41 2020 +0000
      
      	    drm/i915/gem: Avoid gem_context->mutex for simple vma lookup
      
        the locking was changed from dev->struct_mutex to rcu, which added
        the requirement to rcu protect i915_vma. Somehow this was missed in
        review (or I'm completely blind).
      
        Irrespective of all that the vma lookup cache rcu_read_lock grabs a
        full reference of the vma and the rcu doesn't leak further. So no
        impact on i915_address_space from that.
      
        I have not found any other rcu use for i915_vma, but given that it
        seems broken I also didn't bother to do a careful in-depth audit.
      
      Altogether there's nothing left in-tree anymore which requires that a
      pointer deref to an i915_address_space be safe under rcu_read_lock
      only.
      
      rcu protection of i915_address_space was introduced in
      
      commit b32fa811
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Thu Jun 20 19:37:05 2019 +0100
      
          drm/i915/gtt: Defer address space cleanup to an RCU worker
      
      by mixing up a bugfix (i915_address_space needs to be released from
      a worker) with enabling rcu support. The commit message also seems
      somewhat confused, because it talks about cleanup of WC pages
      requiring sleep, while the code and linked bugzilla are about a
      requirement to take dev->struct_mutex (which yes sleeps but it's a
      much more specific problem). Since final kref_put can be called from
      pretty much anywhere (including hardirq context through the
      scheduler's i915_active cleanup) we need a worker here. Hence that
      part must be kept.
      
      Ideally all these reclaim workers should have some kind of integration
      with our shrinkers, but for some of these it's rather tricky. Anyway,
      that's a preexisting condition in the codebase that we won't fix in
      this patch here.
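      
      The pattern that must be kept can be modelled in a few lines of C
      (illustrative only; the real code uses kref_put() and queues a release
      worker, while here a flag stands in for queue_work()):

```c
#include <assert.h>
#include <stdbool.h>

/* The final put may fire in hardirq context (e.g. via i915_active
 * cleanup from the scheduler), so the teardown, which can sleep, is
 * deferred to a worker instead of running inline. */
struct vm_model {
    int refcount;
    bool release_queued;  /* stands in for queue_work() on a release worker */
};

static void vm_put(struct vm_model *vm)
{
    if (--vm->refcount == 0)
        vm->release_queued = true;  /* sleeping cleanup runs later, in process context */
}
```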
      
      We also remove the rcu_barrier in ggtt_cleanup_hw added in
      
      commit 60a4233a
      Author: Chris Wilson <chris@chris-wilson.co.uk>
      Date:   Mon Jul 29 14:24:12 2019 +0100
      
          drm/i915: Flush the i915_vm_release before ggtt shutdown
      Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
      Cc: Jon Bloomfield <jon.bloomfield@intel.com>
      Cc: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
      Cc: Matthew Auld <matthew.auld@intel.com>
      Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
      Cc: Dave Airlie <airlied@redhat.com>
      Cc: Jason Ekstrand <jason@jlekstrand.net>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210902142057.929669-11-daniel.vetter@ffwll.ch
      dcc5d820
  24. 29 Jul, 2021 1 commit
  25. 16 Jul, 2021 1 commit
  26. 05 Jun, 2021 1 commit
  27. 01 Jun, 2021 1 commit
  28. 07 May, 2021 1 commit
  29. 29 Apr, 2021 1 commit
    • drm/i915: Use trylock in shrinker for ggtt on bsw vt-d and bxt, v2. · bc6f80cc
      Maarten Lankhorst authored
      The stop_machine() lock may allocate memory, but is called inside
      vm->mutex, which is taken in the shrinker. This will cause a lockdep
      splat, as can be seen below:
      
      <4>[  462.585762] ======================================================
      <4>[  462.585768] WARNING: possible circular locking dependency detected
      <4>[  462.585773] 5.12.0-rc5-CI-Trybot_7644+ #1 Tainted: G     U
      <4>[  462.585779] ------------------------------------------------------
      <4>[  462.585783] i915_selftest/5540 is trying to acquire lock:
      <4>[  462.585788] ffffffff826440b0 (cpu_hotplug_lock){++++}-{0:0}, at: stop_machine+0x12/0x30
      <4>[  462.585814]
                        but task is already holding lock:
      <4>[  462.585818] ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
      <4>[  462.586301]
                        which lock already depends on the new lock.
      
      <4>[  462.586305]
                        the existing dependency chain (in reverse order) is:
      <4>[  462.586309]
                        -> #2 (&vm->mutex/1){+.+.}-{3:3}:
      <4>[  462.586323]        i915_gem_shrinker_taints_mutex+0x2d/0x50 [i915]
      <4>[  462.586719]        i915_address_space_init+0x12d/0x130 [i915]
      <4>[  462.587092]        ppgtt_init+0x4e/0x80 [i915]
      <4>[  462.587467]        gen8_ppgtt_create+0x3e/0x5c0 [i915]
      <4>[  462.587828]        i915_ppgtt_create+0x28/0xf0 [i915]
      <4>[  462.588203]        intel_gt_init+0x123/0x370 [i915]
      <4>[  462.588572]        i915_gem_init+0x129/0x1f0 [i915]
      <4>[  462.588971]        i915_driver_probe+0x753/0xd80 [i915]
      <4>[  462.589320]        i915_pci_probe+0x43/0x1d0 [i915]
      <4>[  462.589671]        pci_device_probe+0x9e/0x110
      <4>[  462.589680]        really_probe+0xea/0x410
      <4>[  462.589690]        driver_probe_device+0xd9/0x140
      <4>[  462.589697]        device_driver_attach+0x4a/0x50
      <4>[  462.589704]        __driver_attach+0x83/0x140
      <4>[  462.589711]        bus_for_each_dev+0x75/0xc0
      <4>[  462.589718]        bus_add_driver+0x14b/0x1f0
      <4>[  462.589724]        driver_register+0x66/0xb0
      <4>[  462.589731]        i915_init+0x70/0x87 [i915]
      <4>[  462.590053]        do_one_initcall+0x56/0x2e0
      <4>[  462.590061]        do_init_module+0x55/0x200
      <4>[  462.590068]        load_module+0x2703/0x2990
      <4>[  462.590074]        __do_sys_finit_module+0xad/0x110
      <4>[  462.590080]        do_syscall_64+0x33/0x80
      <4>[  462.590089]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.590096]
                        -> #1 (fs_reclaim){+.+.}-{0:0}:
      <4>[  462.590109]        fs_reclaim_acquire+0x9f/0xd0
      <4>[  462.590118]        kmem_cache_alloc_trace+0x3d/0x430
      <4>[  462.590126]        intel_cpuc_prepare+0x3b/0x1b0
      <4>[  462.590133]        cpuhp_invoke_callback+0x9e/0x890
      <4>[  462.590141]        _cpu_up+0xa4/0x130
      <4>[  462.590147]        cpu_up+0x82/0x90
      <4>[  462.590153]        bringup_nonboot_cpus+0x4a/0x60
      <4>[  462.590159]        smp_init+0x21/0x5c
      <4>[  462.590167]        kernel_init_freeable+0x8a/0x1b7
      <4>[  462.590175]        kernel_init+0x5/0xff
      <4>[  462.590181]        ret_from_fork+0x22/0x30
      <4>[  462.590187]
                        -> #0 (cpu_hotplug_lock){++++}-{0:0}:
      <4>[  462.590199]        __lock_acquire+0x1520/0x2590
      <4>[  462.590207]        lock_acquire+0xd1/0x3d0
      <4>[  462.590213]        cpus_read_lock+0x39/0xc0
      <4>[  462.590219]        stop_machine+0x12/0x30
      <4>[  462.590226]        bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
      <4>[  462.590601]        ggtt_bind_vma+0x5d/0x80 [i915]
      <4>[  462.590970]        i915_vma_bind+0xdc/0x1c0 [i915]
      <4>[  462.591374]        i915_vma_pin_ww+0x435/0xb40 [i915]
      <4>[  462.591779]        make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.592170]        igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
      <4>[  462.592562]        __i915_subtests.cold.7+0x42/0x92 [i915]
      <4>[  462.592995]        __run_selftests.part.3+0x10d/0x172 [i915]
      <4>[  462.593428]        i915_live_selftests.cold.5+0x1f/0x47 [i915]
      <4>[  462.593860]        i915_pci_probe+0x93/0x1d0 [i915]
      <4>[  462.594210]        pci_device_probe+0x9e/0x110
      <4>[  462.594217]        really_probe+0xea/0x410
      <4>[  462.594226]        driver_probe_device+0xd9/0x140
      <4>[  462.594233]        device_driver_attach+0x4a/0x50
      <4>[  462.594240]        __driver_attach+0x83/0x140
      <4>[  462.594247]        bus_for_each_dev+0x75/0xc0
      <4>[  462.594254]        bus_add_driver+0x14b/0x1f0
      <4>[  462.594260]        driver_register+0x66/0xb0
      <4>[  462.594267]        i915_init+0x70/0x87 [i915]
      <4>[  462.594586]        do_one_initcall+0x56/0x2e0
      <4>[  462.594592]        do_init_module+0x55/0x200
      <4>[  462.594599]        load_module+0x2703/0x2990
      <4>[  462.594605]        __do_sys_finit_module+0xad/0x110
      <4>[  462.594612]        do_syscall_64+0x33/0x80
      <4>[  462.594618]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.594625]
                        other info that might help us debug this:
      
      <4>[  462.594629] Chain exists of:
                          cpu_hotplug_lock --> fs_reclaim --> &vm->mutex/1
      
      <4>[  462.594645]  Possible unsafe locking scenario:
      
      <4>[  462.594648]        CPU0                    CPU1
      <4>[  462.594652]        ----                    ----
      <4>[  462.594655]   lock(&vm->mutex/1);
      <4>[  462.594664]                                lock(fs_reclaim);
      <4>[  462.594671]                                lock(&vm->mutex/1);
      <4>[  462.594679]   lock(cpu_hotplug_lock);
      <4>[  462.594686]
                         *** DEADLOCK ***
      
      <4>[  462.594690] 4 locks held by i915_selftest/5540:
      <4>[  462.594696]  #0: ffff888100fbc240 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x18/0x50
      <4>[  462.594715]  #1: ffffc900006cb9a0 (reservation_ww_class_acquire){+.+.}-{0:0}, at: make_obj_busy+0x81/0x330 [i915]
      <4>[  462.595118]  #2: ffff88812a6081e8 (reservation_ww_class_mutex){+.+.}-{3:3}, at: make_obj_busy+0x21f/0x330 [i915]
      <4>[  462.595519]  #3: ffff888125369c70 (&vm->mutex/1){+.+.}-{3:3}, at: i915_vma_pin_ww+0x38e/0xb40 [i915]
      <4>[  462.595934]
                        stack backtrace:
      <4>[  462.595939] CPU: 0 PID: 5540 Comm: i915_selftest Tainted: G     U            5.12.0-rc5-CI-Trybot_7644+ #1
      <4>[  462.595947] Hardware name: GOOGLE Kefka/Kefka, BIOS MrChromebox 02/04/2018
      <4>[  462.595952] Call Trace:
      <4>[  462.595961]  dump_stack+0x7f/0xad
      <4>[  462.595974]  check_noncircular+0x12e/0x150
      <4>[  462.595982]  ? save_stack.isra.17+0x3f/0x70
      <4>[  462.595991]  ? drm_mm_insert_node_in_range+0x34a/0x5b0
      <4>[  462.596000]  ? i915_vma_pin_ww+0x9ec/0xb40 [i915]
      <4>[  462.596410]  __lock_acquire+0x1520/0x2590
      <4>[  462.596419]  ? do_init_module+0x55/0x200
      <4>[  462.596429]  lock_acquire+0xd1/0x3d0
      <4>[  462.596435]  ? stop_machine+0x12/0x30
      <4>[  462.596445]  ? gen8_ggtt_insert_entries+0xf0/0xf0 [i915]
      <4>[  462.596816]  cpus_read_lock+0x39/0xc0
      <4>[  462.596824]  ? stop_machine+0x12/0x30
      <4>[  462.596831]  stop_machine+0x12/0x30
      <4>[  462.596839]  bxt_vtd_ggtt_insert_entries__BKL+0x36/0x50 [i915]
      <4>[  462.597210]  ggtt_bind_vma+0x5d/0x80 [i915]
      <4>[  462.597580]  i915_vma_bind+0xdc/0x1c0 [i915]
      <4>[  462.597986]  i915_vma_pin_ww+0x435/0xb40 [i915]
      <4>[  462.598395]  ? make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.598786]  make_obj_busy+0xcb/0x330 [i915]
      <4>[  462.599180]  ? 0xffffffff81000000
      <4>[  462.599187]  ? debug_mutex_unlock+0x50/0xa0
      <4>[  462.599198]  igt_mmap_offset_exhaustion+0x45f/0x4c0 [i915]
      <4>[  462.599592]  __i915_subtests.cold.7+0x42/0x92 [i915]
      <4>[  462.600026]  ? i915_perf_selftests+0x20/0x20 [i915]
      <4>[  462.600422]  ? __i915_nop_setup+0x10/0x10 [i915]
      <4>[  462.600820]  __run_selftests.part.3+0x10d/0x172 [i915]
      <4>[  462.601253]  i915_live_selftests.cold.5+0x1f/0x47 [i915]
      <4>[  462.601686]  i915_pci_probe+0x93/0x1d0 [i915]
      <4>[  462.602037]  ? _raw_spin_unlock_irqrestore+0x3d/0x60
      <4>[  462.602047]  pci_device_probe+0x9e/0x110
      <4>[  462.602057]  really_probe+0xea/0x410
      <4>[  462.602067]  driver_probe_device+0xd9/0x140
      <4>[  462.602075]  device_driver_attach+0x4a/0x50
      <4>[  462.602084]  __driver_attach+0x83/0x140
      <4>[  462.602091]  ? device_driver_attach+0x50/0x50
      <4>[  462.602099]  ? device_driver_attach+0x50/0x50
      <4>[  462.602107]  bus_for_each_dev+0x75/0xc0
      <4>[  462.602116]  bus_add_driver+0x14b/0x1f0
      <4>[  462.602124]  driver_register+0x66/0xb0
      <4>[  462.602133]  i915_init+0x70/0x87 [i915]
      <4>[  462.602453]  ? 0xffffffffa0606000
      <4>[  462.602458]  do_one_initcall+0x56/0x2e0
      <4>[  462.602466]  ? kmem_cache_alloc_trace+0x374/0x430
      <4>[  462.602476]  do_init_module+0x55/0x200
      <4>[  462.602484]  load_module+0x2703/0x2990
      <4>[  462.602500]  ? __do_sys_finit_module+0xad/0x110
      <4>[  462.602507]  __do_sys_finit_module+0xad/0x110
      <4>[  462.602519]  do_syscall_64+0x33/0x80
      <4>[  462.602527]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      <4>[  462.602535] RIP: 0033:0x7fab69d8d89d
      
      Changes since v1:
      - Add lockdep annotations during init, to ensure that lockdep is primed.
        This also fixes a false positive when reading /proc/lockdep_stats
        during module reload.
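      
      The fix named in the title reduces to backing off on contention: the
      shrinker only trylocks vm->mutex, so it never completes the
      cpu_hotplug_lock -> fs_reclaim -> vm->mutex cycle shown above. A toy
      model (not the i915 shrinker code; names are illustrative):

```c
#include <assert.h>
#include <stdbool.h>

struct mutex_model { bool held; };

static bool mutex_trylock_model(struct mutex_model *m)
{
    if (m->held)
        return false;  /* contended: caller must not block here */
    m->held = true;
    return true;
}

/* Returns 1 if the vm was shrunk, 0 if it was skipped on contention. */
static int shrink_vm(struct mutex_model *vm_mutex)
{
    if (!mutex_trylock_model(vm_mutex))
        return 0;       /* back off instead of closing the lock cycle */
    /* ... evict idle vmas under the mutex ... */
    vm_mutex->held = false;  /* unlock */
    return 1;
}
```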
      Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210426102351.921874-1-maarten.lankhorst@linux.intel.com
      bc6f80cc
  30. 27 Apr, 2021 1 commit
    • drm/i915/gtt: map the PD up front · 529b9ec8
      Matthew Auld authored
      We need to generalise our accessor for the page directories and tables from
      using the simple kmap_atomic to supporting local memory, and this setup
      must be done on acquisition of the backing storage, prior to entering
      fence execution contexts. Here we replace the kmap with the object
      mapping code, which for a simple single-page shmemfs object will return a
      plain kmap that is then kept for the lifetime of the page directory.
      
      Note that keeping the mapping around is a potential concern here, since
      while the vma is pinned the mapping remains there for the PDs
      underneath, or at least until the used_count reaches zero, at which
      point we can safely destroy the mapping. For 32b this will be even worse
      since the address space is more limited, but since this change mostly
      impacts full ppGTT platforms, the justification is that for modern
      platforms we shouldn't care too much about 32b.
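      
      The keep-the-mapping scheme can be modelled like so (userspace sketch;
      malloc stands in for the object mapping call, and all names here are
      illustrative, not the i915 page-directory code):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The page directory is mapped once, up front, when its backing store
 * is acquired; the mapping is then reused for every access and only
 * destroyed when used_count drops to zero. */
struct pd_model {
    void *vaddr;     /* persistent mapping, valid while used_count > 0 */
    int used_count;
};

static void *pd_map(struct pd_model *pd)
{
    if (!pd->vaddr)
        pd->vaddr = malloc(4096);  /* stand-in for the object mapping */
    pd->used_count++;
    return pd->vaddr;              /* same pointer for every user */
}

static void pd_unmap(struct pd_model *pd)
{
    if (--pd->used_count == 0) {
        free(pd->vaddr);           /* safe to destroy only now */
        pd->vaddr = NULL;
    }
}
```

      Contrast with kmap_atomic, which would have created and torn down a
      transient mapping around every individual access.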
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20210427085417.120246-3-matthew.auld@intel.com
      529b9ec8
  31. 29 Mar, 2021 2 commits
  32. 24 Mar, 2021 3 commits