Commits · 239bbb2fe927ed762bfe6307ba6a2e2d94e739da · Kirill Smelkov / linux

11 Mar, 2022 2 commits

drm/i915/gt: Remove GEN12_SFC_DONE_MAX from register defs header · 239bbb2f

Matt Roper authored Mar 10, 2022

We shouldn't really be keeping track of how many SFC_DONE registers
our platforms can have, but rather how many SFC hardware units there can
be (each SFC unit will have one corresponding SFC_DONE register). So
drop the stray GEN12_SFC_DONE_MAX definition we had in the register
definition file and replace it with an I915_MAX_SFC that follows the
pattern we use for other hardware units. Note that our hardware has a
2:1:1 ratio of VD:VE:SFC, and as far as we know that pattern should
carry forward to future platforms, so we'll define it as #VCS/2.

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220311062835.163744-1-matthew.d.roper@intel.com

239bbb2f

drm/i915/gem: add missing boundary check in vm_access · 661412e3

Mastan Katragadda authored Mar 03, 2022

A missing bounds check in vm_access() can lead to an out-of-bounds read
or write in the adjacent memory area, since the len attribute is not
validated before the memcpy later in the function, potentially hitting:

[  183.637831] BUG: unable to handle page fault for address: ffffc90000c86000
[  183.637934] #PF: supervisor read access in kernel mode
[  183.637997] #PF: error_code(0x0000) - not-present page
[  183.638059] PGD 100000067 P4D 100000067 PUD 100258067 PMD 106341067 PTE 0
[  183.638144] Oops: 0000 [#2] PREEMPT SMP NOPTI
[  183.638201] CPU: 3 PID: 1790 Comm: poc Tainted: G      D           5.17.0-rc6-ci-drm-11296+ #1
[  183.638298] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake H DDR4 RVP, BIOS CNLSFWR1.R00.X208.B00.1905301319 05/30/2019
[  183.638430] RIP: 0010:memcpy_erms+0x6/0x10
[  183.640213] RSP: 0018:ffffc90001763d48 EFLAGS: 00010246
[  183.641117] RAX: ffff888109c14000 RBX: ffff888111bece40 RCX: 0000000000000ffc
[  183.642029] RDX: 0000000000001000 RSI: ffffc90000c86000 RDI: ffff888109c14004
[  183.642946] RBP: 0000000000000ffc R08: 800000000000016b R09: 0000000000000000
[  183.643848] R10: ffffc90000c85000 R11: 0000000000000048 R12: 0000000000001000
[  183.644742] R13: ffff888111bed190 R14: ffff888109c14000 R15: 0000000000001000
[  183.645653] FS:  00007fe5ef807540(0000) GS:ffff88845b380000(0000) knlGS:0000000000000000
[  183.646570] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  183.647481] CR2: ffffc90000c86000 CR3: 000000010ff02006 CR4: 00000000003706e0
[  183.648384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  183.649271] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  183.650142] Call Trace:
[  183.650988]  <TASK>
[  183.651793]  vm_access+0x1f0/0x2a0 [i915]
[  183.652726]  __access_remote_vm+0x224/0x380
[  183.653561]  mem_rw.isra.0+0xf9/0x190
[  183.654402]  vfs_read+0x9d/0x1b0
[  183.655238]  ksys_read+0x63/0xe0
[  183.656065]  do_syscall_64+0x38/0xc0
[  183.656882]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  183.657663] RIP: 0033:0x7fe5ef725142
[  183.659351] RSP: 002b:00007ffe1e81c7e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[  183.660227] RAX: ffffffffffffffda RBX: 0000557055dfb780 RCX: 00007fe5ef725142
[  183.661104] RDX: 0000000000001000 RSI: 00007ffe1e81d880 RDI: 0000000000000005
[  183.661972] RBP: 00007ffe1e81e890 R08: 0000000000000030 R09: 0000000000000046
[  183.662832] R10: 0000557055dfc2e0 R11: 0000000000000246 R12: 0000557055dfb1c0
[  183.663691] R13: 00007ffe1e81e980 R14: 0000000000000000 R15: 0000000000000000

Changes since v1:
     - Updated if condition with range_overflows_t [Chris Wilson]

Fixes: 9f909e21 ("drm/i915: Implement vm_ops->access for gdb access into mmaps")
Signed-off-by: Mastan Katragadda <mastanx.katragadda@intel.com>
Suggested-by: Adam Zabrocki <adamza@microsoft.com>
Reported-by: Jackson Cody <cody.jackson@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: <stable@vger.kernel.org> # v5.8+
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
[mauld: tidy up the commit message and add Cc: stable]
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303060428.1668844-1-mastanx.katragadda@intel.com

661412e3

08 Mar, 2022 3 commits

drm/i915/xehp: Drop aux table invalidation on FlatCCS platforms · 6639fabb

Matt Roper authored Feb 28, 2022

Platforms with FlatCCS do not use auxiliary planes for compression
control data and thus do not need traditional aux table invalidation
(and the registers no longer even exist).

Original-author: CQ Tang
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301052952.1706597-1-matthew.d.roper@intel.com

6639fabb

drm/i915: opportunistically apply ALLOC_CONTIGIOUS · 2ed38cec

Matthew Auld authored Feb 02, 2022

It looks like this code was accidentally dropped at some point(in a
slightly different form), so add it back. The gist is that if we know
the allocation will be one single chunk, then we can just annotate the
BO with I915_BO_ALLOC_CONTIGUOUS, even if the user doesn't bother. In
the future this should allow us to avoid using vmap for such objects,
in some upcoming patches.

v2(Thomas):
  - Tweak the commit message to mention the future motivation
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220202173154.3758970-1-matthew.auld@intel.com

2ed38cec

drm/i915/gtt: reduce overzealous alignment constraints for GGTT · c64fa77d

Matthew Auld authored Mar 03, 2022

Currently this will enforce both 2M alignment and padding for any LMEM
pages inserted into the GGTT. However, this was only meant to be applied
to the compact-pt layout with the ppGTT. For the GGTT we can reduce the
alignment and padding to 64K.

Bspec: 45015
Fixes: 87bd701e ("drm/i915: enforce min GTT alignment for discrete cards")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Robert Beckett <bob.beckett@collabora.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303100229.839282-1-matthew.auld@intel.com

c64fa77d

07 Mar, 2022 6 commits

drm/i915: stop checking for NULL vma->obj · e4b3ee71

Matthew Auld authored Mar 04, 2022

This is no longer possible since e6e1a304 ("drm/i915: vma is always
backed by an object.").
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304174252.1000238-1-matthew.auld@intel.com

e4b3ee71

drm/i915: limit the async bind to bind_async_flags · 833124a0

Matthew Auld authored Mar 04, 2022

If the vm doesn't request async binding, like for example with the dpt,
then we should be able to skip the async path and avoid calling
i915_vm_lock_objects() altogether. Currently if we have a moving fence
set for the BO(even though it might have signalled), we still take the
async patch regardless of the bind_async setting, and then later still
end up just doing i915_gem_object_wait_moving_fence() anyway.

Alternatively we would need to add dummy scratch object which can be
locked, just for the dpt.
Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304095934.925036-2-matthew.auld@intel.com

833124a0

drm/i915/fbdev: fixup setting screen_size · 892bfb8a

Matthew Auld authored Mar 04, 2022

Since we are actually mapping the object and not the vma, when dealing
with LMEM, we should be careful and use the backing store size here,
since the vma->node.size could have all kinds of funny padding
constraints, which could result in us writing to OOB address.

v2(Chris):
  - Prefer vma->size here, which should be the backing store size. Some
    more rework is needed here to stop using node.size in some other
    places.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304095934.925036-1-matthew.auld@intel.com

892bfb8a

drm/i915/gem: Remove some unnecessary code · eb950819

Thomas Hellström authored Mar 04, 2022

The test for vma should always return true, and when assigning -EBUSY
to ret, the variable should already have that value.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-4-thomas.hellstrom@linux.intel.com

eb950819

drm/i915: Remove the vma refcount · d9393973

Thomas Hellström authored Mar 04, 2022

Now that i915_vma_parked() is taking the object lock on vma destruction,
and the only user of the vma refcount, i915_gem_object_unbind()
also takes the object lock, remove the vma refcount.

v3: Documentation update.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-3-thomas.hellstrom@linux.intel.com

d9393973

drm/i915: Remove the vm open count · e1a7ab4f

Thomas Hellström authored Mar 04, 2022

vms are not getting properly closed. Rather than fixing that,
Remove the vm open count and instead rely on the vm refcount.

The vm open count existed solely to break the strong references the
vmas had on the vms. Now instead make those references weak and
ensure vmas are destroyed when the vm is destroyed.

Unfortunately if the vm destructor and the object destructor both
wants to destroy a vma, that may lead to a race in that the vm
destructor just unbinds the vma and leaves the actual vma destruction
to the object destructor. However in order for the object destructor
to ensure the vma is unbound it needs to grab the vm mutex. In order
to keep the vm mutex alive until the object destructor is done with
it, somewhat hackishly grab a vm_resv refcount that is released late
in the vma destruction process, when the vm mutex is no longer needed.

v2: Address review-comments from Niranjana
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack
and should ideally be replaced in an upcoming patch.
- Remove an unneeded continue in clear_vm_list and update comment.

v3:
- Documentation update
- Commit message formatting
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-2-thomas.hellstrom@linux.intel.com

e1a7ab4f

06 Mar, 2022 2 commits

drm/i915/dmabuf: Fix prime_mmap to work when using LMEM · d028a769

Gwan-gyeong Mun authored Feb 25, 2022

The current implementation of i915 prime mmap only works when initializing
drm_i915_gem_object with shmem_region.
When using LMEM, drm_i915_gem_object is initialized with ttm_system_region.
In order to make prime mmap work even this case, when using LMEM
(when using ttm in i915), dma_buf_ops.mmap callback function calls
drm_gem_prime_mmap(). drm_gem_prime_mmap() of drm core calls internally
i915_gem_mmap() so that prime mmap can perform normally.
The fake offset is processed inside drm_gem_prime_mmap().

Testcase: igt/prime_mmap

Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225131316.1433515-3-gwan-gyeong.mun@intel.com

d028a769

drm/i915/dmabuf: Update dma_buf_ops.unmap_dma_buf callback to use drm_gem_unmap_dma_buf() · dcb62550

Gwan-gyeong Mun authored Feb 25, 2022

The dma_buf_ops.unmap_dma_buf callback used in i915,
i915_gem_unmap_dma_buf(), has the same code as drm_gem_unmap_dma_buf().
In order to eliminate defining and using duplicate function, it updates
the dma_buf_ops.unmap_dma_buf callback to use drm_gem_unmap_dma_buf().
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225131316.1433515-2-gwan-gyeong.mun@intel.com

dcb62550

04 Mar, 2022 2 commits

drm/i915: Add RCS mask to GuC ADS params · 18ac067b

Stuart Summers authored Mar 03, 2022

If RCS is not enumerated, GuC will return invalid parameters.
Make sure we do not send RCS supported when we have not enumerated
it.

Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303223435.2793124-2-matthew.d.roper@intel.com

18ac067b

drm/i915/xehp: Support platforms with CCS engines but no RCS · f9576e36

Matt Roper authored Mar 03, 2022

In the past we've always assumed that an RCS engine is present on every
platform.  However now that we have compute engines there may be
platforms that have CCS engines but no RCS, or platforms that are
designed to have both, but have the RCS engine fused off.

Various engine-centric initialization that only needs to be done a
single time for the group of RCS+CCS engines can't rely on being setup
with the RCS now; instead we add a I915_ENGINE_FIRST_RENDER_COMPUTE flag
that will be assigned to a single engine in the group; whichever engine
has this flag will be responsible for some of the general setup
(RCU_MODE programming, initialization of certain workarounds, etc.).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303223435.2793124-1-matthew.d.roper@intel.com

f9576e36

03 Mar, 2022 8 commits

drm/i915/guc: Fix potential invalid pointer dereferences when decoding G2Hs · e1dd8714

John Harrison authored Mar 01, 2022

Some G2H handlers were reading the context id field from the payload
before checking the payload met the minimum length required.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-9-John.C.Harrison@Intel.com

e1dd8714

drm/i915/guc: Drop obsolete H2G definitions · d4de9a3e

John Harrison authored Mar 01, 2022

The CTB registration process changed significantly a while back using
a single KLV based H2G. So drop the original and now obsolete H2G
definitions.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-8-John.C.Harrison@Intel.com

d4de9a3e

drm/i915/guc: Rename desc_idx to ctx_id · 77dcbffb

John Harrison authored Mar 01, 2022

The LRC descriptor pool is going away. So, stop naming context ids as
descriptor pool indecies.

While at it, add a bunch of missing line feeds to some error messages.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-7-John.C.Harrison@Intel.com

77dcbffb

drm/i915/guc: Move lrc desc setup to where it is needed · 8e2e9c43

John Harrison authored Mar 01, 2022

The LRC descriptor was being initialised early on in the context
registration sequence. It could then be determined that the actual
registration needs to be delayed and the descriptor would be wiped
out. This is inefficient, so move the setup to later in the process
after the point of no return.

v2: Move some split changes into the split patch (and do them
correctly).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-6-John.C.Harrison@Intel.com

8e2e9c43

drm/i915/guc: Split guc_lrc_desc_pin apart · 58ea7d62

John Harrison authored Mar 01, 2022

The LRC descriptor pool is going away. Further, the function that was
populating it was also doing a bunch of logic about the context
registration sequence. So, split that code apart into separate state
setup and try to register functions. Note that some of those 'try to
register' code paths actually undo the state setup and leave it to be
redone again later (with potentially different values). This is
inefficient. The next patch will correct this.

Also, move a comment about ignoring return values to the place where
the return values are actually ignored.

v2: Move some more splitting from a later patch (and do it correctly).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-5-John.C.Harrison@Intel.com

58ea7d62

drm/i915/guc: Better name for context id limit · d1249022

John Harrison authored Mar 01, 2022

The LRC descriptor pool is going away. So, stop using it as the limit
for how many context ids are available. Instead, size the pool
according to the number of contexts allowed. Note that this is just a
naming change, the actual limit is identical in value.

While at it, also update a kzalloc(sizeof()*count) to be a
kcalloc(count,size).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-4-John.C.Harrison@Intel.com

d1249022

drm/i915/guc: Add an explicit 'submission_initialized' flag · 09570c50

John Harrison authored Mar 01, 2022

The LRC descriptor pool is going away. So, stop using it as a check
for whether submission has been initialised or not.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-3-John.C.Harrison@Intel.com

09570c50

drm/i915/guc: Do not conflate lrc_desc with GuC id for registration · 02942b42

John Harrison authored Mar 01, 2022

The LRC descriptor pool is going away. So, stop using it as a check for
context registration, use the GuC id instead (being the thing that
actually gets registered with the GuC).

Also, rename the set/clear/query helper functions for context id
mappings to better reflect their purpose and to differentiate from
other registration related helper functions.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302003357.4188363-2-John.C.Harrison@Intel.com

02942b42

02 Mar, 2022 14 commits

drm/i915/xehpsdv: Move render/compute engine reset domains related workarounds · b2006061

Srinivasan Shanmugam authored Mar 01, 2022

Registers that exist in the shared render/compute reset domain need to
be placed on an engine workaround list to ensure that they are properly
re-applied whenever an RCS or CCS engine is reset.  We have a number of
workarounds (updating registers MLTICTXCTL, L3SQCREG1_CCS0,
GEN12_MERT_MOD_CTRL, and GEN12_GAMCNTRL_CTRL) that are incorrectly
implemented on the 'gt' workaround list and need to be moved
accordingly.

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.s@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-14-matthew.d.roper@intel.com

b2006061

drm/i915/xehp: Add compute workarounds · ff6b19d3

Matt Roper authored Mar 01, 2022

Additional workarounds are required once we start exposing CCS engines.

Note that we have a number of workarounds that update registers in the
shared render/compute reset domain. Historically we've just added such
registers to the RCS engine's workaround list. But going forward we
should be more careful to place such workarounds on a wa_list for an
engine that definitely exists and is not fused off (e.g., a platform
with no RCS would never apply the RCS wa_list). We'll keep
rcs_engine_wa_init() focused on RCS-specific workarounds that only need
to be applied if the RCS engine is present. A separate
general_render_compute_wa_init() function will be used to define
workarounds that touch registers in the shared render/compute reset
domain and that we need to apply regardless of what render and/or
compute engines actually exist. Any workarounds defined in this new
function will internally be added to the first present RCS or CCS
engine's workaround list to ensure they get applied (and only get
applied once rather than being needlessly re-applied several times).

Co-author: Srinivasan Shanmugam
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-13-matthew.d.roper@intel.com

ff6b19d3

drm/i915/xehp: handle fused off CCS engines · 88ed07cb

Daniele Ceraolo Spurio authored Mar 01, 2022

HW resources are divided across the active CCS engines at the compute
slice level, with each CCS having priority on one of the cslices.
If a compute slice has no enabled DSS, its paired compute engine is not
usable in full parallel execution because the other ones already fully
saturate the HW, so consider it fused off.

v2 (José):
 - moved it to its own function
 - fixed definition of ccs_mask

v3 (Matt):
 - Replace fls() condition with a simple IP version test

v4 (Matt):
 - Don't try to calculate a ccs_mask using
   intel_slicemask_from_dssmask() until we've determined that we're
   running on an Xe_HP platform where the logic makes sense (and won't
   overflow).

Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302052008.1884985-1-matthew.d.roper@intel.com

88ed07cb

drm/i915/xehp: Don't support parallel submission on compute / render · e393e2aa

Matthew Brost authored Mar 01, 2022

A different emit breadcrumbs ring programming is required for compute /
render and we don't have UMD user so just reject parallel submission for
these engine classes.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-11-matthew.d.roper@intel.com

e393e2aa

drm/i915/xehp/guc: enable compute engine inside GuC · ea4ca894

Daniele Ceraolo Spurio authored Mar 01, 2022

Tell GuC that CCS is enabled by setting the CCS mask in its ADS.

Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Original-author: Michel Thierry
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-10-matthew.d.roper@intel.com

ea4ca894

drm/i915/xehp: Enable ccs/dual-ctx in RCU_MODE · 87cb6d80

Matt Roper authored Mar 01, 2022

We have to specify in the Render Control Unit Mode register
when CCS is enabled.

v2:
 - Move RCU_MODE programming to a helper function.  (Tvrtko)
 - Clean up and clarify comments.  (Tvrtko)
 - Add RCU_MODE to the GuC save/restore list.  (Daniele)
v3:
 - Move this patch before the GuC ADS update to enable compute engines;
   the definition of RCU_MODE and its insertion into the save/restore
   list moves to this patch.  (Daniele)
v4:
 - Call xehp_enable_ccs_engines() directly in guc_resume() and
   execlists_resume() rather than adding an extra layer of wrapping to
   the engine->resume() vfunc.  (Umesh)

Bspec: 46034
Original-author: Michel Thierry
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302001554.1836066-1-matthew.d.roper@intel.com

87cb6d80

drm/i915/xehp: Define context scheduling attributes in lrc descriptor · adfadb56

Matt Roper authored Mar 01, 2022

In Dual Context mode the EUs are shared between render and compute
command streamers. The hardware provides a field in the lrc descriptor
to indicate the prioritization of the thread dispatch associated to the
corresponding context.

The context priority is set to 'low' at creation time and relies on the
existing context priority to set it to low/normal/high.

Bspec: 46145, 46260
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Prasad Nallani <prasad.nallani@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-8-matthew.d.roper@intel.com

adfadb56

drm/i915: Move context descriptor fields to intel_lrc.h · f4c1fdb9

Matt Roper authored Mar 01, 2022

This is a more appropriate header for these definitions.

v2:
 - Cleanup whitespace. (Lucas)
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-7-matthew.d.roper@intel.com

f4c1fdb9

drm/i915/xehp: CCS should use RCS setup functions · c674c5b9

Matt Roper authored Mar 01, 2022

The compute engine handles the same commands the render engine can
(except 3D pipeline), so it makes sense that CCS is more similar to RCS
than non-render engines.

The CCS context state (lrc) is also similar to the render one, so reuse
it. Note that the compute engine has its own CTX_R_PWR_CLK_STATE
register.

In order to avoid having multiple RCS && CCS checks, add the following
engine flag:
 - I915_ENGINE_HAS_RCS_REG_STATE - use the render (larger) reg state ctx.

BSpec: 46260
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-6-matthew.d.roper@intel.com

c674c5b9

drm/i915/xehp: compute engine pipe_control · 803efd29

Daniele Ceraolo Spurio authored Mar 01, 2022

CCS will reuse the RCS functions for breadcrumb and flush emission.
However, CCS pipe_control has additional programming restrictions:
 - Command Streamer Stall Enable must be always set
 - Post Sync Operations must not be set to Write PS Depth Count
 - 3D-related bits must not be set

v2:
 - Drop unwanted blank line.  (Lucas)

Bspec: 47112
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-5-matthew.d.roper@intel.com

803efd29

drm/i915/xehp: Add Compute CS IRQ handlers · 505c4857

Matt Roper authored Mar 01, 2022

Add execlists and GuC interrupts for compute CS into existing IRQ handlers.

All compute command streamers belong to the same compute class, so the
only change needed to enable their interrupts is to program their GT engine
interrupt mask registers.

CCS0 shares the register with CCS1, while CCS2 and CCS3 are in a new one.

BSpec: 50844, 54029, 54030, 53223, 53224.
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-4-matthew.d.roper@intel.com

505c4857

drm/i915/xehp: CCS shares the render reset domain · 4b88ad50

Matt Roper authored Mar 01, 2022

The reset domain is shared between render and all compute engines,
so resetting one will affect the others.

Note:  Before performing a reset on an RCS or CCS engine, the GuC will
attempt to preempt-to-idle the other non-hung RCS/CCS engines to avoid
impacting other clients (since some shared modules will be reset).  If
other engines are executing non-preemptable workloads, the impact is
unavoidable and some work may be lost.

Bspec: 52549
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-3-matthew.d.roper@intel.com

4b88ad50

drm/i915/xehp: Define compute class and engine · 944823c9

Matt Roper authored Mar 01, 2022

Introduce a Compute Command Streamer (CCS), which has access to
the media and GPGPU pipelines (but not the 3D pipeline).

To begin with, define the compute class/engine common functions, based
on the existing render ones.

v2:
 - Add kerneldoc for drm_i915_gem_engine_class since we're adding a new
   element to it.  (Daniel)
 - Make engine class <-> guc class converters use lookup tables to make
   it more clear/explicit how the IDs map.  (Tvrtko)

v3:
 - Don't update uapi for now; we'll just include the driver-internal
   changes for the time being.

Bspec: 46167, 45544
Original-author: Michel Thierry
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-2-matthew.d.roper@intel.com

944823c9

drm/i915: Depend on !PREEMPT_RT. · a8b2b8b0

Sebastian Andrzej Siewior authored Feb 14, 2022

There are a few sections in the driver which are not compatible with
PREEMPT_RT. They trigger warnings and can lead to deadlocks at runtime.

Disable the i915 driver on a PREEMPT_RT enabled kernel. This way
PREEMPT_RT itself can be enabled without needing to address the i915
issues first. The RT related patches are still in RT queue and will be
handled later.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YgqmfKhwU5spS069@linutronix.de

a8b2b8b0

01 Mar, 2022 3 commits

drm/i915/guc: Do not complain about stale reset notifications · e2a1e7ab

John Harrison authored Feb 24, 2022

It is possible for reset notifications to arrive for a context that is
in the process of being banned. So don't flag these as an error, just
report it as informational (because it is still useful to know that
resets are happening even if they are being ignored).

v2: Better wording for the message (review feedback from Tvrtko).
v3: Fix rebase issue (review feedback from Daniele).
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225015232.1939497-1-John.C.Harrison@Intel.com

e2a1e7ab

drm/i915/guc: Initialize GuC submission locks and queues early · e068ef3f

Daniele Ceraolo Spurio authored Feb 14, 2022

Move initialization of submission-related spinlock, lists and workers to
init_early. This fixes an issue where if the GuC init fails we might
still try to get the lock in the context cleanup code. Note that it is
safe to call the GuC context cleanup code even if the init failed
because all contexts are initialized with an invalid GuC ID, which will
cause the GuC side of the cleanup to be skipped, so it is easier to just
make sure the variables are initialized than to special case the cleanup
to handle the case when they're not.

References: https://gitlab.freedesktop.org/drm/intel/-/issues/4932Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220215011123.734572-1-daniele.ceraolospurio@intel.com

e068ef3f

drm/i915/guc: Fix flag query helper function to not modify state · eee5215b

John Harrison authored Feb 17, 2022

A flag query helper was actually writing to the flags word rather than
just reading. Fix that. Also update the function's comment as it was
out of date.

NB: No need for a 'Fixes' tag. The test was only ever used inside a
BUG_ON during context registration. Rather than asserting that the
condition was true, it was making the condition true. So, in theory,
there was no consequence because we should never have hit a BUG_ON
anyway. Which means the write should always have been a no-op.
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220217212942.629922-1-John.C.Harrison@Intel.com

eee5215b