Commits · 342206b7cc064b8b004474c0baab2c67ced646d0 · Kirill Smelkov / linux

21 Dec, 2023 40 commits

drm/xe: Always use xe_vm_queue_rebind_worker helper · 342206b7

Matthew Brost authored Jul 26, 2023

Do not queue the rebind worker directly, rather use the helper
xe_vm_queue_rebind_worker. This ensures we use the correct work queue.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

342206b7

drm/xe: Invert guc vs execlists parameters and info. · c8dc1546

Rodrigo Vivi authored Jul 21, 2023

The module parameter should reflect the name of the optional,
experimental and unsafe option, rather than the default one.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

c8dc1546

drm/xe/uapi: Remove XE_QUERY_CONFIG_FLAGS_USE_GUC · c856cc13

Rodrigo Vivi authored Jul 21, 2023

This config is the only real one. If execlist remains in the
code it will forever be experimental and we shouldn't maintain
an uapi like that for that experimental piece of code that
should never be used by real users.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>

c856cc13

drm/xe: fully turn on small-bar support · c00ce7f2

Matthew Auld authored Mar 31, 2023

This allows vram_size > io_size, instead of just clamping the vram size
to the BAR size, now that the driver supports it.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

c00ce7f2

drm/xe/uapi: add the userspace bits for small-bar · cd928fce

Matthew Auld authored Mar 31, 2023

Mostly the same as i915. We add a new hint for userspace to force an
object into the mappable part of vram.

We also need to tell userspace how large the mappable part is. In Vulkan
for example, there will be two vram heaps for small-bar systems. And
here the size of each heap needs to be known. Likewise the used/avail
tracking needs to account for the mappable part.

We also limit the available tracking going forward, such that we limit
to privileged users only, since these values are system wide and are
technically considered an info leak.

v2 (Maarten):
  - s/NEEDS_CPU_ACCESS/NEEDS_VISIBLE_VRAM/ in the uapi. We also no
    longer require smem as an extra placement. This is more flexible,
    and lets us use this for clear-color surfaces, since we need CPU access
    there but we don't want to attach smem, since that effectively disables
    CCS from kernel pov.
  - Reject clear-color CCS buffers where NEEDS_VISIBLE_VRAM is not set,
    instead of migrating it behind the scenes.
v3 (José):
  - Split the changes that limit the accounting for perfmon_capable()
    into a separate patch.
  - Use XE_BO_CREATE_VRAM_MASK.
v4 (Gwan-gyeong Mun):
  - Add some kernel-doc for the query bits.
v5:
  - One small kernel-doc correction. The cpu_visible_size and
    corresponding used tracking are always zero for non
    XE_MEM_REGION_CLASS_VRAM.
v6:
  - Without perfmon_capable() it likely makes more sense to report as
    zero, instead of reporting as used == total size. This should give
    similar behaviour as i915 which rather tracks free instead of used.
  - Only enforce NEEDS_VISIBLE_VRAM on rc_ccs_cc_plane surfaces when the
    device is actually small-bar.

Testcase: igt/tests/xe_query
Testcase: igt/tests/xe_mmap@small-bar
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: José Roberto de Souza <jose.souza@intel.com>
Cc: Filip Hazubski <filip.hazubski@intel.com>
Cc: Carl Zhang <carl.zhang@intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

cd928fce

drm/xe/bo: support tiered vram allocation for small-bar · 6a024f1b

Matthew Auld authored Mar 31, 2023

Add the new flag XE_BO_NEEDS_CPU_ACCESS, to force allocating in the
mappable part of vram. If no flag is specified we do a topdown
allocation, to limit the chances of stealing the precious mappable part,
if we don't need it. If this is a full-bar system, then this all gets
nooped.

For kernel users, it looks like xe_bo_create_pin_map() is the central
place which users should call if they want CPU access to the object, so
add the flag there.

We still need to plumb this through for userspace allocations. Also it
looks like page-tables are using pin_map(), which is less than ideal. If
we can already use the GPU to do page-table management, then maybe we
should just force that for small-bar.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

6a024f1b

drm/xe: xe_engine_create_ioctl should check gt_count, not tile_count · 2a6d871b

Matt Roper authored Jul 24, 2023

Platforms like MTL only have a single tile, but multiple GTs.
Ensure XE_ENGINE_CREATE accepts engine creation on gt1 on such
platforms.
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20230725003433.1992137-4-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2a6d871b

drm/xe/mtl: Map PPGTT as CPU:WC · 7a060d78

Matt Roper authored Jul 24, 2023

On MTL and beyond, the GPU performs non-coherent accesses to the PPGTT
page tables.  These page tables should be mapped as CPU:WC.

Removes CAT errors triggered by xe_exec_basic@once-basic on MTL:

   xe 0000:00:02.0: [drm:__xe_pt_bind_vma [xe]] Preparing bind, with range [1a0000...1a0fff) engine 0000000000000000.
   xe 0000:00:02.0: [drm:xe_vm_dbg_print_entries [xe]] 1 entries to update
   xe 0000:00:02.0: [drm:xe_vm_dbg_print_entries [xe]]  0: Update level 3 at (0 + 1) [0...8000000000) f:0
   xe 0000:00:02.0: [drm] Engine memory cat error: guc_id=2
   xe 0000:00:02.0: [drm] Engine memory cat error: guc_id=2
   xe 0000:00:02.0: [drm] Timedout job: seqno=4294967169, guc_id=2, flags=0x4

v2:
 - Rename to XE_BO_PAGETABLE to make it more clear that this BO is the
   pagetable itself, rather than just being bound in the PPGTT.  (Lucas)

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://lore.kernel.org/r/20230725003433.1992137-3-matthew.d.roper@intel.comSigned-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7a060d78

drm/xe: add lockdep annotation for xe_device_mem_access_put() · 9700a1df

Matthew Auld authored Jul 24, 2023

The main motivation is with d3cold which will make the suspend and
resume callbacks even more scary, but is useful regardless. We already
have the needed annotation on the acquire side with
xe_device_mem_access_get(), and by adding the annotation on the release
side we should have a lot more confidence that our locking hierarchy is
correct.

v2:
  - Move the annotation into both callbacks for better symmetry. Also
    don't hold over the entire mem_access_get(); we only need to lockep
    to understand what is being held upon entering mem_access_get(), and
    how that matches up with locks in the callbacks.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

9700a1df

drm/xe: Use migrate engine for page fault binds · 7ead3315

Matthew Brost authored Jul 21, 2023

We must use migrate engine for page fault binds in order to avoid a
deadlock as the migrate engine has a reserved BCS instance which cannot
be stuck on a fault. To use the migrate engine the engine argument to
xe_migrate_update_pgtables must be NULL, this was incorrectly wired up
so vm->eng[tile_id] was always being used. Fix this.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7ead3315

drm/xe: Only alloc userptr part of xe_vma for userptrs · a4cc60a5

Matthew Brost authored Jul 19, 2023

Only alloc userptr part of xe_vma for userptrs, this will save on space
in the common BO case.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a4cc60a5

drm/xe: Combine destroy_cb and destroy_work in xe_vma into union · eae553cb

Matthew Brost authored Jul 19, 2023

The callback kicks the worker thus mutually exclusive execution,
combining saves a bit of space in xe_vma.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

eae553cb

drm/xe: Change tile masks from u64 to u8 · 63412a5a

Matthew Brost authored Jul 19, 2023

This will save us a few bytes in the xe_vma structure.

v2: Use hweight8 rather than hweight_long (Rodrigo)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

63412a5a

drm/xe: Replace list_del_init with list_del for userptr.invalidate_link cleanup · 3daf694c

Matthew Brost authored Jul 19, 2023

This list isn't used again, list_del is the proper call.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3daf694c

drm/xe: Reduce the number list links in xe_vma · 1655c893

Matthew Brost authored Jul 19, 2023

Combine the userptr, rebind, and destroy links into a union as
the lists these links belong to are mutually exclusive.

v2: Adjust which lists are combined (Thomas H)
v3: Add kernel doc why this is safe (Thomas H), remove related change
of list_del_init -> list_del (Rodrigo)
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

1655c893

drm/xe: Avoid doing rebinds · 8f33b4f0

Matthew Brost authored Jul 19, 2023

If we dont change page sizes we can avoid doing rebinds rather just do a
partial unbind. The algorithm to determine its page size is greedy as we
assume all pages in the removed VMA are the largest page used in the
VMA.

v2: Don't exceed 100 lines
v3: struct xe_vma_op_unmap remove in different patch, remove XXX comment
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

8f33b4f0

drm/xe: Remove xe_vma_op_unmap · 3188c0f4

Matthew Brost authored Jul 19, 2023

xe_vma_op_unmap isn't used, remove it.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

3188c0f4

drm/xe: Make bind engines safe · fd84041d

Matthew Brost authored Jul 19, 2023

We currently have a race between bind engines which can result in
corrupted page tables leading to faults.

A simple example:
bind A 0x0000-0x1000, engine A, has unsatisfied in-fence
bind B 0x1000-0x2000, engine B, no in-fences
exec A uses 0x1000-0x2000

Bind B will pass bind A and exec A will fault. This occurs as bind A
programs the root of the page table in a bind job which is held up by an
in-fence. Bind B in this case just programs a leaf entry of the
structure.

To fix use range-fence utility to track cross bind engine conflicts. In
the above example bind A would insert an dependency into the range-fence
tree with a key of 0x0-0x7fffffffff, bind B would find that dependency
and its bind job would scheduled behind the unsatisfied in-fence and
bind A's job.

Reviewed-by: Maarten Lankhorst<maarten.lankhorst@linux.intel.com>
Co-developed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

fd84041d

drm/xe: Introduce a range-fence utility · 845f64bd

Thomas Hellström authored Jul 09, 2023

Add generic utility to track range conflicts signaled by a dma-fence.
Tracking implemented via an interval tree. An example use case being
tracking conflicts for pending (un)binds from multiple bind engines. By
being generic ths idea would this could moved to the DRM level and used
in multiple drivers for similar problems.

v2: Make interval tree functions static (CI)
v3: Remove non-static cleanup function (CI)
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

845f64bd

drm/xe/execlist: Log when using execlist submission · 0043a3e8

Francois Dugast authored Jul 19, 2023

Make explicit in the log that execlist submission is used to prevent from
silently using it over GuC submission.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0043a3e8

drm/xe: Cleanup style warnings and errors · 72e8d73b

Francois Dugast authored Jul 19, 2023

Fix 6 errors and 20 warnings reported by checkpatch.pl.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

72e8d73b

drm/xe/execlist: Remove leftover printk messages · ea82d5aa

Francois Dugast authored Jul 19, 2023

Those look like leftover debug and are not even being used. If they were
real debug/info, they should be using the drm helpers.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ea82d5aa

drm/xe: Rely on kmalloc/kzalloc log message · 955c09e2

Francois Dugast authored Jul 19, 2023

Those messages are unnecessary because a generic message is already
produced in case of allocation failure. Besides, this also removes a
misuse of the XE_IOCTL_DBG macro.
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

955c09e2

drm/xe: Use FIELD_PREP/FIELD_GET for tile id encoding · 4d18eac0

Lucas De Marchi authored Jul 18, 2023

Use FIELD_PREP()/FIELD_GET() to encode the tile id into flags. Besides
protecting for eventual overflow it also makes it easier to see a new
flag can't be added as BIT(7).
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20230718193924.3084759-2-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

4d18eac0

drm/xe: Normalize XE_VM_FLAG* names · 0d39b6da

Lucas De Marchi authored Jul 18, 2023

Rename XE_VM_FLAGS_64K to XE_VM_FLAG_64K to follow the other names and
s/GT/TILE/ that got missed in commit 08dea767 ("drm/xe: Move
migration from GT to tile").
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://lore.kernel.org/r/20230718193924.3084759-1-lucas.demarchi@intel.comSigned-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

0d39b6da

drm/xe: add missing bulk_move reset · ee82d2da

Matthew Auld authored Jul 13, 2023

It looks like bulk_move is set during object construction, but is only
removed on object close, however in various places we might not yet have
an actual fd to close, like on the error paths for the gem_create ioctl,
and also one internal user for the evict_test_run_gt() selftest. Try to
handle those cases by manually resetting the bulk_move. This should
prevent triggering:

WARNING: CPU: 7 PID: 8252 at drivers/gpu/drm/ttm/ttm_bo.c:327
ttm_bo_release+0x25e/0x2a0 [ttm]

v2 (Nirmoy):
  - It should be safe to just unconditionally call
    __xe_bo_unset_bulk_move() in most places.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

ee82d2da

drm/xe/selftests: restart GT after xe_bo_restore_kernel() · 5a142f9c

Matthew Auld authored Jul 13, 2023

Test seems to be failing badly after calling xe_bo_restore_kernel().
Taking a snapshot of the CTB and copying back a potentially old version
seems risky, depending on what might have been inflight. Also it seems
snapshotting the ADS object and copying back results in serious
breakage. Normally when calling xe_bo_restore_kernel() we always fully
restart the GT, which re-intializes such things.  We could potentially
skip saving and restoring such objects in xe_bo_evict_all() however
seems quite fragile not to also restart the GT. Try to do that here by
triggering a GT reset.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

5a142f9c

drm/xe/selftests: hold rpm for ccs_test_migrate() · 93990291

Matthew Auld authored Jul 12, 2023

The GPU job will keep the device awake, however assumption here is that
caller of xe_migrate_clear() is also holding mem_access.ref otherwise we
hit the asserts in xe_sa_bo_flush_write() prior to the job construction.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

93990291

drm/xe/selftests: hold rpm for evict_test_run_device() · 6a0612ae

Matthew Auld authored Jul 12, 2023

We are calling fairly low level things like xe_bo_restore_kernel() which
expect caller to be holding mem_access.ref. Since we are doing stuff
like evict_all we likely don't want to race with rpm suspend, since that
potentially wants to do the same thing, so just wrap the whole test.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

6a0612ae

drm/xe: add lockdep annotation for xe_device_mem_access_get() · e3d23092

Matthew Auld authored Jul 19, 2023

The atomics here might hide potential issues, also rpm core is not
holding any lock when calling our rpm resume callback, so add a dummy lock
with the idea that xe_pm_runtime_resume() is eventually going to be
called when we are holding it. This only needs to happen once and then
lockdep can validate all callers and their locks.

v2: (Thomas Hellström)
 - Prefer static lockdep_map instead of full blown mutex.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

e3d23092

drm/xe: drop xe_device_mem_access_get() from invalidation_vma · 7d623575

Matthew Auld authored Jul 19, 2023

Lockdep gives the following splat:

[  594.158863] ffff888140da53f0 (&vm->userptr.notifier_lock){++++}-{3:3}, at: vma_userptr_invalidate+0xeb/0x330 [xe]
[  594.158921]
               but task is already holding lock:
[  594.158926] ffffffff82761940
(mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: unmap_vmas+0x0/0x1c0
[  594.158941]
               which lock already depends on the new lock.

[  594.158947]
               the existing dependency chain (in reverse order) is:
[  594.158953]
               -> #5 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[  594.158961]        fs_reclaim_acquire+0x68/0xd0
[  594.158969]        __kmem_cache_alloc_node+0x2c/0x1b0
[  594.158975]        kmalloc_node_trace+0x1d/0xb0
[  594.158983]        alloc_worker+0x18/0x50
[  594.158989]        init_rescuer.part.0+0x13/0xa0
[  594.158995]        workqueue_init+0xdf/0x210
[  594.159001]        kernel_init_freeable+0x5c/0x2f0
[  594.159009]        kernel_init+0x11/0x1a0
[  594.159017]        ret_from_fork+0x29/0x50
[  594.159023]
               -> #4 (fs_reclaim){+.+.}-{0:0}:
[  594.159031]        fs_reclaim_acquire+0xa0/0xd0
[  594.159037]        __kmem_cache_alloc_node+0x2c/0x1b0
[  594.159042]        kmalloc_trace+0x20/0xb0
[  594.159048]        acpi_device_add+0x25a/0x3f0
[  594.159056]        acpi_add_single_object+0x387/0x750
[  594.159063]        acpi_bus_check_add+0x108/0x280
[  594.159069]        acpi_bus_scan+0x34/0xf0
[  594.159075]        acpi_scan_init+0xed/0x2b0
[  594.159082]        acpi_init+0x21e/0x520
[  594.159087]        do_one_initcall+0x53/0x260
[  594.159092]        kernel_init_freeable+0x18a/0x2f0
[  594.159099]        kernel_init+0x11/0x1a0
[  594.159105]        ret_from_fork+0x29/0x50
[  594.159110]
               -> #3 (acpi_device_lock){+.+.}-{3:3}:
[  594.159117]        __mutex_lock+0x95/0xd10
[  594.159122]        acpi_enable_wakeup_device_power+0x30/0x120
[  594.159130]        __acpi_device_wakeup_enable+0x34/0x110
[  594.159138]        acpi_pm_set_device_wakeup+0x55/0x140
[  594.159143]        __pci_enable_wake+0x56/0xb0
[  594.159150]        pci_finish_runtime_suspend+0x35/0x80
[  594.159157]        pci_pm_runtime_suspend+0xb5/0x1a0
[  594.159162]        __rpm_callback+0x3c/0x110
[  594.159170]        rpm_callback+0x58/0x70
[  594.159176]        rpm_suspend+0x15c/0x6f0
[  594.159182]        pm_runtime_work+0x9b/0xb0
[  594.159188]        process_one_work+0x263/0x520
[  594.159195]        worker_thread+0x4d/0x3b0
[  594.159200]        kthread+0xeb/0x120
[  594.159206]        ret_from_fork+0x29/0x50
[  594.159211]
               -> #2 (acpi_wakeup_lock){+.+.}-{3:3}:
[  594.159218]        __mutex_lock+0x95/0xd10
[  594.159223]        acpi_pm_set_device_wakeup+0x7a/0x140
[  594.159228]        __pci_enable_wake+0x77/0xb0
[  594.159234]        pci_pm_runtime_resume+0x70/0xd0
[  594.159240]        __rpm_callback+0x3c/0x110
[  594.159246]        rpm_callback+0x58/0x70
[  594.159252]        rpm_resume+0x50d/0x7a0
[  594.159258]        rpm_resume+0x267/0x7a0
[  594.159264]        __pm_runtime_resume+0x45/0x90
[  594.159270]        xe_pm_runtime_resume_and_get+0x12/0x50 [xe]
[  594.159314]        xe_device_mem_access_get+0x97/0xc0 [xe]
[  594.159346]        hw_engines+0x65/0xf0 [xe]
[  594.159380]        seq_read_iter+0x10d/0x4b0
[  594.159385]        seq_read+0x9e/0xd0
[  594.159390]        full_proxy_read+0x4e/0x80
[  594.159396]        vfs_read+0xb6/0x310
[  594.159401]        ksys_read+0x60/0xe0
[  594.159406]        do_syscall_64+0x38/0x90
[  594.159413]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  594.159419]
               -> #1 (&xe->mem_access.lock){+.+.}-{3:3}:
[  594.159427]        xe_device_mem_access_get+0x43/0xc0 [xe]
[  594.159457]        xe_gt_tlb_invalidation_vma+0x53/0x190 [xe]
[  594.159490]        invalidation_fence_init+0x1d2/0x2c0 [xe]
[  594.159529]        __xe_pt_unbind_vma+0x151/0x4e0 [xe]
[  594.159564]        vm_bind_ioctl+0x48a/0xae0 [xe]
[  594.159602]        async_op_work_func+0x20c/0x530 [xe]
[  594.159634]        process_one_work+0x263/0x520
[  594.159640]        worker_thread+0x4d/0x3b0
[  594.159646]        kthread+0xeb/0x120
[  594.159650]        ret_from_fork+0x29/0x50
[  594.159655]
               -> #0 (&vm->userptr.notifier_lock){++++}-{3:3}:
[  594.159663]        __lock_acquire+0x16fa/0x2850
[  594.159670]        lock_acquire+0xd2/0x2e0
[  594.159676]        down_write+0x36/0xd0
[  594.159681]        vma_userptr_invalidate+0xeb/0x330 [xe]
[  594.159714]        __mmu_notifier_invalidate_range_start+0x239/0x2a0
[  594.159722]        unmap_vmas+0x1ac/0x1c0
[  594.159727]        unmap_region+0xb5/0x120
[  594.159732]        do_vmi_align_munmap+0x2be/0x430
[  594.159739]        do_vmi_munmap+0xea/0x120
[  594.159744]        __vm_munmap+0x9c/0x160
[  594.159750]        __x64_sys_munmap+0x12/0x20
[  594.159756]        do_syscall_64+0x38/0x90
[  594.159761]        entry_SYSCALL_64_after_hwframe+0x72/0xdc
[  594.159768]
               other info that might help us debug this:

[  594.159773] Chain exists of:
                 &vm->userptr.notifier_lock --> fs_reclaim -->
mmu_notifier_invalidate_range_start

[  594.159785]  Possible unsafe locking scenario:

[  594.159790]        CPU0                    CPU1
[  594.159794]        ----                    ----
[  594.159797]   lock(mmu_notifier_invalidate_range_start);
[  594.159802]                                lock(fs_reclaim);
[  594.159808]
lock(mmu_notifier_invalidate_range_start);
[  594.159814]   lock(&vm->userptr.notifier_lock);
[  594.159819]

The VM should be holding a mem_access.ref so this looks like it should
be a false positive and we can just drop the explicit mem_access in
xe_gt_tlb_invalidation().  The GGTT invalidation path also takes care to
hold mem_access.ref so should be fine there also, and we already assert
that we hold access.ref for the GuC communication underneath.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7d623575

drm/xe/ggtt: prime ggtt->lock against FS_RECLAIM · e018f44b

Matthew Auld authored Jul 19, 2023

Increase the sensitivity of the ggtt->lock by priming it against
FS_RECLAIM, such that allocating memory while holding will result in
lockdep splats.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

e018f44b

drm/xe: drop xe_device_mem_access_get() from guc_ct_send · 7eed01a9

Matthew Auld authored Jul 19, 2023

The callers should already be holding the mem_access reference, before
calling into this.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7eed01a9

drm/xe: ensure correct access_put ordering · 03af26c9

Matthew Auld authored Jul 19, 2023

Only call access_put after dropping the forcewake. In theory the device
could suspend, but really we want to start asserting that we have a
mem_access.ref when touching mmio.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

03af26c9

drm/xe/mmio: grab mem_access in xe_mmio_ioctl · 7da1d76f

Matthew Auld authored Jul 19, 2023

Any kind of device memory access should first ensure the device is not
suspended, mmio included.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

7da1d76f

drm/xe/guc_pc: add missing mem_access for freq_rpe_show · 2d3ab1fa

Matthew Auld authored Jul 19, 2023

The mem_access is meant to cover any kind of device level memory access,
mmio included.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2d3ab1fa

drm/xe/debugfs: grab mem_access around forcewake · 6bfbd0c5

Matthew Auld authored Jul 19, 2023

We need keep the device awake when performing any kind of mmio operation.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/279Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

6bfbd0c5

drm/xe/vm: tidy up xe_runtime_pm usage · 2d30332a

Matthew Auld authored Jul 19, 2023

The xe_device_mem_access_get() should be all that's needed here and
should now work as expected, without any strange races. In theory should
be no functional changes here.
Reported-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

2d30332a

drm/xe: fix xe_device_mem_access_get() races · a00b8f1a

Matthew Auld authored Jul 19, 2023

It looks like there is at least one race here, given that the
pm_runtime_suspended() check looks to return false if we are in the
process of suspending the device (RPM_SUSPENDING vs RPM_SUSPENDED).  We
later also do xe_pm_runtime_get_if_active(), but since the device is
suspending or has now suspended, this doesn't do anything either.
Following from this we can potentially return from
xe_device_mem_access_get() with the device suspended or about to be,
leading to broken behaviour.

Attempt to fix this by always grabbing the runtime ref when our internal
ref transitions from 0 -> 1. The hard part is then dealing with the
runtime_pm callbacks also calling xe_device_mem_access_get() and
deadlocking, which the pm_runtime_suspended() check prevented.

v2:
 - ct->lock looks to be primed with fs_reclaim, so holding that and then
   allocating memory will cause lockdep to complain. Now that we
   unconditionally grab the mem_access.lock around mem_access_{get,put}, we
   need to change the ordering wrt to grabbing the ct->lock, since some of
   the runtime_pm routines can allocate memory (or at least that's what
   lockdep seems to suggest). Hopefully not a big deal.  It might be that
   there were already issues with this, just that the atomics where
   "hiding" the potential issues.
v3:
 - Use Thomas Hellström' idea with tracking the active task that is
   executing in the resume or suspend callback, in order to avoid
   recursive resume/suspend calls deadlocking on itself.
 - Split the ct->lock change.
v4:
 - Add smb_mb() around accessing the pm_callback_task for extra safety.
   (Thomas Hellström)
v5:
 - Clarify the kernel-doc for the mem_access.lock, given that it is quite
   strange in what it protects (data vs code). The real motivation is to
   aid lockdep. (Rodrigo Vivi)
v6:
 - Split out the lock change. We still want this as a lockdep aid but
   only for the xe_device_mem_access_get() path. Sticking a lock on the
   put() looks be a no-go, also the runtime_put() there is always async.
 - Now that the lock is gone move to atomics and rely on the pm code
   serialising multiple callers on the 0 -> 1 transition.
 - g2h_worker_func() looks to be the next issue, given that
   suspend-resume callbacks are using CT, so try to handle that.
v7:
 - Add xe_device_mem_access_get_if_ongoing(), and use it in
   g2h_worker_func().
v8 (Anshuman):
 - Just always grab the rpm, instead of just on the 0 -> 1 transition,
   which is a lot clearer and simplifies the code quite a bit.
v9:
 - Make sure we also adjust the CT fast-path with if-active.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/258Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Anshuman Gupta <anshuman.gupta@intel.com>
Acked-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

a00b8f1a

drm/xe/pm: Init pcode and restore vram on power lost · 09d88e3b

Anshuman Gupta authored Jul 18, 2023

Don't init pcode and restore VRAM objects in vain.
We can rely on primary GT GUC_STATUS to detect whether
card has really lost power even when d3cold is allowed by xe.
Adding d3cold.lost_power flag to avoid pcode init and vram
restoration.
Also cleaning up the TODO code comment.

v2:
- %s/xe_guc_has_lost_power()/xe_guc_in_reset().
- Used existing gt instead of new variable. [Rodrigo]
- Added kernel-doc function comment. [Rodrigo]
- xe_guc_in_reset() return true if failed to get fw.

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230718080703.239343-6-anshuman.gupta@intel.comSigned-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>

09d88e3b