Commit c6a3d735 authored by Dave Airlie

Merge tag 'drm-intel-gt-next-2022-06-29' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Document memory residency and Flat-CCS capability of obj (Ramalingam C)
- Disable GETPARAM lookups of I915_PARAM_[SUB]SLICE_MASK on Xe_HP+ (Matt Roper)
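
A minimal userspace sketch of the GETPARAM change above (assumptions: an open i915 render-node fd and the standard uAPI header; the exact errno returned on Xe_HP+ is not spelled out in this excerpt):

#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static int query_slice_mask(int fd, int *mask)
{
	struct drm_i915_getparam gp = {
		.param = I915_PARAM_SLICE_MASK,
		.value = mask,
	};

	/* Succeeds on older parts; on Xe_HP+ the legacy slice/subslice
	 * masks no longer describe the hardware, so the lookup now fails. */
	return ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
}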

Cross-subsystem Changes:

- Rename intel-gtt symbols (Lucas De Marchi)

Core Changes:

Driver Changes:

- Support programming the EU priority in the GuC descriptor (DG2) (Matthew Brost)
- DG2 HuC loading support (Daniele Ceraolo Spurio)
- Fix build error without CONFIG_PM (YueHaibing)
- Enable THP on Icelake and beyond (Tvrtko Ursulin)
- Only set up private tmpfs mount when needed and fix logging (Tvrtko Ursulin)
- Make __guc_reset_context aware of guilty engines (Umesh Nerlige Ramappa)
- DG2 small bar memory probing fixes (Nirmoy Das)
- Remove unnecessary GuC err capture noise (Alan Previn)
- Fix i915_gem_object_ggtt_pin_ww regression on old platforms (Maarten Lankhorst)
- Fix undefined behavior in GuC backend due to shift overflowing the constant (Borislav Petkov)
- New DG2 workarounds (Swathi Dhanavanthri, Anshuman Gupta)
- Report no hwconfig support on ADL-N (Balasubramani Vivekanandan)
- Fix error_state_read ptr + offset use (Alan Previn)
- Expose per tile media freq factor in sysfs (Ashutosh Dixit, Dale B Stimson)
- Fix memory leaks in per-gt sysfs (Ashutosh Dixit)
- Fix dma_resv fence handling in multi-batch execbuf (Nirmoy Das)
- Add extra registers to GPU error dump on Gen11+ (Stuart Summers)
- More PVC+DG2 workarounds (Matt Roper)
- Improve user experience and driver robustness under SIGINT or similar (Tvrtko Ursulin)
- Don't show engine classes not present (Tvrtko Ursulin)
- Improve suspend / resume time with VT-d enabled (Thomas Hellström)
- Add missing else (katrinzhou)
- Don't leak lmem mapping in vma_evict (Juha-Pekka Heikkila)
- Add smem fallback allocation for dpt (Juha-Pekka Heikkila)
- Tweak the ordering in cpu_write_needs_clflush (Matthew Auld)
- Do not access rq->engine without a reference (Niranjana Vishwanathapura)
- Revert "drm/i915: Hold reference to intel_context over life of i915_request" (Niranjana Vishwanathapura)
- Don't update engine busyness stats too frequently (Alan Previn)
- Add additional steps for Wa_22011802037 for execlist backend (Umesh Nerlige Ramappa)
- Fix a lockdep warning at error capture (Nirmoy Das)

- Ponte Vecchio prep work and new blitter engines (Matt Roper, John Harrison, Lucas De Marchi)
- Read correct RP_STATE_CAP register (PVC) (Matt Roper)
- Define MOCS table for PVC (Ayaz A Siddiqui)
- Driver refactor and support Ponte Vecchio forcewake handling (Matt Roper)
- Remove additional 3D flags from PIPE_CONTROL (Ponte Vecchio) (Stuart Summers)
- XEHPSDV and PVC do not use HuC (Daniele Ceraolo Spurio)
- Extract stepping information from PCI revid (Ponte Vecchio) (Matt Roper)
- Add initial PVC workarounds (Stuart Summers)
- SSEU handling driver refactor and Ponte Vecchio support (Matt Roper)
- GuC depriv applies to PVC (Matt Roper)
- Add register steering (Ponte Vecchio) (Matt Roper)
- Add recommended MMIO setting (Ponte Vecchio) (Matt Roper)

- Move multicast register handling to a dedicated file (Matt Roper)
- Cleanup interface for MCR operations (Matt Roper)
- Extend i915_vma_pin_iomap() (CQ Tang)
- Re-do the intel-gtt split (Lucas De Marchi)
- Correct duplicated/misplaced GT register definitions (Matt Roper)
- Prefer "XEHP_" prefix for registers (Matt Roper)

- Don't use DRM_DEBUG_WARN_ON for unexpected l3bank/mslice config (Tvrtko Ursulin)
- Don't use DRM_DEBUG_WARN_ON for ring unexpectedly not idle (Tvrtko Ursulin)
- Make drop_pages() return bool (Lucas De Marchi)
- Fix CFI violation with show_dynamic_id() (Nathan Chancellor)
- Use i915_probe_error instead of drm_error in GuC code (Vinay Belgaumkar)
- Fix use of static in macro mismatch (Andi Shyti)
- Update tiled blits selftest (Bommu Krishnaiah)
- Future-proof platform checks (Matt Roper)
- Only include what's needed (Jani Nikula)
- Remove accidental static from a local variable (Jani Nikula)
- Add global forcewake request to drpc (Vinay Belgaumkar)
- Fix spelling typo in comment (pengfuyuan)
- Increase timeout for live_parallel_switch selftest (Akeem G Abodunrin)
- Use non-blocking H2G for waitboost (Vinay Belgaumkar)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YrwtLM081SQUG1Dc@tursulin-desk
parents f9292174 a0696856
......@@ -246,6 +246,18 @@ Display State Buffer
.. kernel-doc:: drivers/gpu/drm/i915/display/intel_dsb.c
:internal:
GT Programming
==============
Multicast/Replicated (MCR) Registers
------------------------------------
.. kernel-doc:: drivers/gpu/drm/i915/gt/intel_gt_mcr.c
:doc: GT Multicast/Replicated (MCR) Register Support
.. kernel-doc:: drivers/gpu/drm/i915/gt/intel_gt_mcr.c
:internal:
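
For orientation, a hedged sketch of the two read styles the new intel_gt_mcr.c interface provides, using only calls that appear later in this diff (gt, engine, slice and subslice are assumed to be in scope):

	/* Steer explicitly to a known-present instance... */
	u32 row = intel_gt_mcr_read(engine->gt, GEN7_ROW_INSTDONE, slice, subslice);

	/* ...or let the helper steer to any instance that is not fused off. */
	u32 range = intel_gt_mcr_read_any(&i915->gt0, XEHP_TILE0_ADDR_RANGE);
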
Memory Management and Command Submission
========================================
......
......@@ -744,7 +744,7 @@ static void i830_write_entry(dma_addr_t addr, unsigned int entry,
writel_relaxed(addr | pte_flags, intel_private.gtt + entry);
}
bool intel_enable_gtt(void)
bool intel_gmch_enable_gtt(void)
{
u8 __iomem *reg;
......@@ -787,7 +787,7 @@ bool intel_enable_gtt(void)
return true;
}
EXPORT_SYMBOL(intel_enable_gtt);
EXPORT_SYMBOL(intel_gmch_enable_gtt);
static int i830_setup(void)
{
......@@ -821,8 +821,8 @@ static int intel_fake_agp_free_gatt_table(struct agp_bridge_data *bridge)
static int intel_fake_agp_configure(void)
{
if (!intel_enable_gtt())
return -EIO;
if (!intel_gmch_enable_gtt())
return -EIO;
intel_private.clear_fake_agp = true;
agp_bridge->gart_bus_addr = intel_private.gma_bus_addr;
......@@ -844,20 +844,20 @@ static bool i830_check_flags(unsigned int flags)
return false;
}
void intel_gtt_insert_page(dma_addr_t addr,
unsigned int pg,
unsigned int flags)
void intel_gmch_gtt_insert_page(dma_addr_t addr,
unsigned int pg,
unsigned int flags)
{
intel_private.driver->write_entry(addr, pg, flags);
readl(intel_private.gtt + pg);
if (intel_private.driver->chipset_flush)
intel_private.driver->chipset_flush();
}
EXPORT_SYMBOL(intel_gtt_insert_page);
EXPORT_SYMBOL(intel_gmch_gtt_insert_page);
void intel_gtt_insert_sg_entries(struct sg_table *st,
unsigned int pg_start,
unsigned int flags)
void intel_gmch_gtt_insert_sg_entries(struct sg_table *st,
unsigned int pg_start,
unsigned int flags)
{
struct scatterlist *sg;
unsigned int len, m;
......@@ -879,13 +879,13 @@ void intel_gtt_insert_sg_entries(struct sg_table *st,
if (intel_private.driver->chipset_flush)
intel_private.driver->chipset_flush();
}
EXPORT_SYMBOL(intel_gtt_insert_sg_entries);
EXPORT_SYMBOL(intel_gmch_gtt_insert_sg_entries);
#if IS_ENABLED(CONFIG_AGP_INTEL)
static void intel_gtt_insert_pages(unsigned int first_entry,
unsigned int num_entries,
struct page **pages,
unsigned int flags)
static void intel_gmch_gtt_insert_pages(unsigned int first_entry,
unsigned int num_entries,
struct page **pages,
unsigned int flags)
{
int i, j;
......@@ -905,7 +905,7 @@ static int intel_fake_agp_insert_entries(struct agp_memory *mem,
if (intel_private.clear_fake_agp) {
int start = intel_private.stolen_size / PAGE_SIZE;
int end = intel_private.gtt_mappable_entries;
intel_gtt_clear_range(start, end - start);
intel_gmch_gtt_clear_range(start, end - start);
intel_private.clear_fake_agp = false;
}
......@@ -934,12 +934,12 @@ static int intel_fake_agp_insert_entries(struct agp_memory *mem,
if (ret != 0)
return ret;
intel_gtt_insert_sg_entries(&st, pg_start, type);
intel_gmch_gtt_insert_sg_entries(&st, pg_start, type);
mem->sg_list = st.sgl;
mem->num_sg = st.nents;
} else
intel_gtt_insert_pages(pg_start, mem->page_count, mem->pages,
type);
intel_gmch_gtt_insert_pages(pg_start, mem->page_count, mem->pages,
type);
out:
ret = 0;
......@@ -949,7 +949,7 @@ static int intel_fake_agp_insert_entries(struct agp_memory *mem,
}
#endif
void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
void intel_gmch_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
{
unsigned int i;
......@@ -959,7 +959,7 @@ void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
}
wmb();
}
EXPORT_SYMBOL(intel_gtt_clear_range);
EXPORT_SYMBOL(intel_gmch_gtt_clear_range);
#if IS_ENABLED(CONFIG_AGP_INTEL)
static int intel_fake_agp_remove_entries(struct agp_memory *mem,
......@@ -968,7 +968,7 @@ static int intel_fake_agp_remove_entries(struct agp_memory *mem,
if (mem->page_count == 0)
return 0;
intel_gtt_clear_range(pg_start, mem->page_count);
intel_gmch_gtt_clear_range(pg_start, mem->page_count);
if (intel_private.needs_dmar) {
intel_gtt_unmap_memory(mem->sg_list, mem->num_sg);
......@@ -1431,22 +1431,22 @@ int intel_gmch_probe(struct pci_dev *bridge_pdev, struct pci_dev *gpu_pdev,
}
EXPORT_SYMBOL(intel_gmch_probe);
void intel_gtt_get(u64 *gtt_total,
phys_addr_t *mappable_base,
resource_size_t *mappable_end)
void intel_gmch_gtt_get(u64 *gtt_total,
phys_addr_t *mappable_base,
resource_size_t *mappable_end)
{
*gtt_total = intel_private.gtt_total_entries << PAGE_SHIFT;
*mappable_base = intel_private.gma_bus_addr;
*mappable_end = intel_private.gtt_mappable_entries << PAGE_SHIFT;
}
EXPORT_SYMBOL(intel_gtt_get);
EXPORT_SYMBOL(intel_gmch_gtt_get);
void intel_gtt_chipset_flush(void)
void intel_gmch_gtt_flush(void)
{
if (intel_private.driver->chipset_flush)
intel_private.driver->chipset_flush();
}
EXPORT_SYMBOL(intel_gtt_chipset_flush);
EXPORT_SYMBOL(intel_gmch_gtt_flush);
void intel_gmch_remove(void)
{
......
......@@ -103,6 +103,7 @@ gt-y += \
gt/intel_gt_debugfs.o \
gt/intel_gt_engines_debugfs.o \
gt/intel_gt_irq.o \
gt/intel_gt_mcr.o \
gt/intel_gt_pm.o \
gt/intel_gt_pm_debugfs.o \
gt/intel_gt_pm_irq.o \
......@@ -129,7 +130,7 @@ gt-y += \
gt/shmem_utils.o \
gt/sysfs_engines.o
# x86 intel-gtt module support
gt-$(CONFIG_X86) += gt/intel_gt_gmch.o
gt-$(CONFIG_X86) += gt/intel_ggtt_gmch.o
# autogenerated null render state
gt-y += \
gt/gen6_renderstate.o \
......
......@@ -4,6 +4,7 @@
*/
#include "gem/i915_gem_domain.h"
#include "gem/i915_gem_internal.h"
#include "gt/gen8_ppgtt.h"
#include "i915_drv.h"
......@@ -127,8 +128,12 @@ struct i915_vma *intel_dpt_pin(struct i915_address_space *vm)
struct i915_vma *vma;
void __iomem *iomem;
struct i915_gem_ww_ctx ww;
u64 pin_flags = 0;
int err;
if (i915_gem_object_is_stolen(dpt->obj))
pin_flags |= PIN_MAPPABLE;
wakeref = intel_runtime_pm_get(&i915->runtime_pm);
atomic_inc(&i915->gpu_error.pending_fb_pin);
......@@ -138,7 +143,7 @@ struct i915_vma *intel_dpt_pin(struct i915_address_space *vm)
continue;
vma = i915_gem_object_ggtt_pin_ww(dpt->obj, &ww, NULL, 0, 4096,
HAS_LMEM(i915) ? 0 : PIN_MAPPABLE);
pin_flags);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
continue;
......@@ -248,10 +253,13 @@ intel_dpt_create(struct intel_framebuffer *fb)
size = round_up(size * sizeof(gen8_pte_t), I915_GTT_PAGE_SIZE);
if (HAS_LMEM(i915))
dpt_obj = i915_gem_object_create_lmem(i915, size, I915_BO_ALLOC_CONTIGUOUS);
else
dpt_obj = i915_gem_object_create_lmem(i915, size, I915_BO_ALLOC_CONTIGUOUS);
if (IS_ERR(dpt_obj) && i915_ggtt_has_aperture(to_gt(i915)->ggtt))
dpt_obj = i915_gem_object_create_stolen(i915, size);
if (IS_ERR(dpt_obj) && !HAS_LMEM(i915)) {
drm_dbg_kms(&i915->drm, "Allocating dpt from smem\n");
dpt_obj = i915_gem_object_create_internal(i915, size);
}
if (IS_ERR(dpt_obj))
return ERR_CAST(dpt_obj);
......
......@@ -933,8 +933,9 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
case I915_CONTEXT_PARAM_PERSISTENCE:
if (args->size)
ret = -EINVAL;
ret = proto_context_set_persistence(fpriv->dev_priv, pc,
args->value);
else
ret = proto_context_set_persistence(fpriv->dev_priv, pc,
args->value);
break;
case I915_CONTEXT_PARAM_PROTECTED_CONTENT:
......@@ -1367,7 +1368,8 @@ static struct intel_engine_cs *active_engine(struct intel_context *ce)
return engine;
}
static void kill_engines(struct i915_gem_engines *engines, bool ban)
static void
kill_engines(struct i915_gem_engines *engines, bool exit, bool persistent)
{
struct i915_gem_engines_iter it;
struct intel_context *ce;
......@@ -1381,9 +1383,15 @@ static void kill_engines(struct i915_gem_engines *engines, bool ban)
*/
for_each_gem_engine(ce, engines, it) {
struct intel_engine_cs *engine;
bool skip = false;
if (ban && intel_context_ban(ce, NULL))
continue;
if (exit)
skip = intel_context_set_exiting(ce);
else if (!persistent)
skip = intel_context_exit_nonpersistent(ce, NULL);
if (skip)
continue; /* Already marked. */
/*
* Check the current active state of this context; if we
......@@ -1395,7 +1403,7 @@ static void kill_engines(struct i915_gem_engines *engines, bool ban)
engine = active_engine(ce);
/* First attempt to gracefully cancel the context */
if (engine && !__cancel_engine(engine) && ban)
if (engine && !__cancel_engine(engine) && (exit || !persistent))
/*
* If we are unable to send a preemptive pulse to bump
* the context from the GPU, we have to resort to a full
......@@ -1407,8 +1415,6 @@ static void kill_engines(struct i915_gem_engines *engines, bool ban)
static void kill_context(struct i915_gem_context *ctx)
{
bool ban = (!i915_gem_context_is_persistent(ctx) ||
!ctx->i915->params.enable_hangcheck);
struct i915_gem_engines *pos, *next;
spin_lock_irq(&ctx->stale.lock);
......@@ -1421,7 +1427,8 @@ static void kill_context(struct i915_gem_context *ctx)
spin_unlock_irq(&ctx->stale.lock);
kill_engines(pos, ban);
kill_engines(pos, !ctx->i915->params.enable_hangcheck,
i915_gem_context_is_persistent(ctx));
spin_lock_irq(&ctx->stale.lock);
GEM_BUG_ON(i915_sw_fence_signaled(&pos->fence));
......@@ -1467,7 +1474,8 @@ static void engines_idle_release(struct i915_gem_context *ctx,
kill:
if (list_empty(&engines->link)) /* raced, already closed */
kill_engines(engines, true);
kill_engines(engines, true,
i915_gem_context_is_persistent(ctx));
i915_sw_fence_commit(&engines->fence);
}
......@@ -1875,6 +1883,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt,
{
const struct sseu_dev_info *device = &gt->info.sseu;
struct drm_i915_private *i915 = gt->i915;
unsigned int dev_subslice_mask = intel_sseu_get_hsw_subslices(device, 0);
/* No zeros in any field. */
if (!user->slice_mask || !user->subslice_mask ||
......@@ -1901,7 +1910,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt,
if (user->slice_mask & ~device->slice_mask)
return -EINVAL;
if (user->subslice_mask & ~device->subslice_mask[0])
if (user->subslice_mask & ~dev_subslice_mask)
return -EINVAL;
if (user->max_eus_per_subslice > device->max_eus_per_subslice)
......@@ -1915,7 +1924,7 @@ i915_gem_user_to_context_sseu(struct intel_gt *gt,
/* Part specific restrictions. */
if (GRAPHICS_VER(i915) == 11) {
unsigned int hw_s = hweight8(device->slice_mask);
unsigned int hw_ss_per_s = hweight8(device->subslice_mask[0]);
unsigned int hw_ss_per_s = hweight8(dev_subslice_mask);
unsigned int req_s = hweight8(context->slice_mask);
unsigned int req_ss = hweight8(context->subslice_mask);
......
......@@ -35,12 +35,12 @@ bool i915_gem_cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
if (obj->cache_dirty)
return false;
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
return true;
if (IS_DGFX(i915))
return false;
if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
return true;
/* Currently in use by HW (display engine)? Keep flushed. */
return i915_gem_object_is_framebuffer(obj);
}
......
......@@ -999,7 +999,8 @@ static int eb_validate_vmas(struct i915_execbuffer *eb)
}
}
err = dma_resv_reserve_fences(vma->obj->base.resv, 1);
/* Reserve enough slots to accommodate composite fences */
err = dma_resv_reserve_fences(vma->obj->base.resv, eb->num_batches);
if (err)
return err;
......
......@@ -670,17 +670,10 @@ i915_gem_object_create_shmem_from_data(struct drm_i915_private *dev_priv,
static int init_shmem(struct intel_memory_region *mem)
{
int err;
err = i915_gemfs_init(mem->i915);
if (err) {
DRM_NOTE("Unable to create a private tmpfs mount, hugepage support will be disabled(%d).\n",
err);
}
i915_gemfs_init(mem->i915);
intel_memory_region_set_name(mem, "system");
return 0; /* Don't error, we can simply fallback to the kernel mnt */
return 0; /* We fall back to the kernel mnt if gemfs init failed. */
}
static int release_shmem(struct intel_memory_region *mem)
......
......@@ -36,7 +36,7 @@ static bool can_release_pages(struct drm_i915_gem_object *obj)
return swap_available() || obj->mm.madv == I915_MADV_DONTNEED;
}
static int drop_pages(struct drm_i915_gem_object *obj,
static bool drop_pages(struct drm_i915_gem_object *obj,
unsigned long shrink, bool trylock_vm)
{
unsigned long flags;
......
......@@ -13,6 +13,8 @@
#include "gem/i915_gem_lmem.h"
#include "gem/i915_gem_region.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_mcr.h"
#include "gt/intel_gt_regs.h"
#include "gt/intel_region_lmem.h"
#include "i915_drv.h"
#include "i915_gem_stolen.h"
......@@ -834,8 +836,8 @@ i915_gem_stolen_lmem_setup(struct drm_i915_private *i915, u16 type,
} else {
resource_size_t lmem_range;
lmem_range = intel_gt_read_register(&i915->gt0, XEHPSDV_TILE0_ADDR_RANGE) & 0xFFFF;
lmem_size = lmem_range >> XEHPSDV_TILE_LMEM_RANGE_SHIFT;
lmem_range = intel_gt_mcr_read_any(&i915->gt0, XEHP_TILE0_ADDR_RANGE) & 0xFFFF;
lmem_size = lmem_range >> XEHP_TILE_LMEM_RANGE_SHIFT;
lmem_size *= SZ_1G;
}
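
A worked reading of the computation above (the register value is hypothetical, and the shift width of 8 is an assumption inferred from the XEHP_TILE_LMEM_RANGE_SHIFT usage):

	/* If the low 16 bits of XEHP_TILE0_ADDR_RANGE read 0x0400 and the
	 * range shift is 8, lmem_size = (0x0400 >> 8) * SZ_1G = 4 GiB. */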
......
......@@ -114,7 +114,7 @@ u32 i915_gem_fence_alignment(struct drm_i915_private *i915, u32 size,
return i915_gem_fence_size(i915, size, tiling, stride);
}
/* Check pitch constriants for all chips & tiling formats */
/* Check pitch constraints for all chips & tiling formats */
static bool
i915_tiling_ok(struct drm_i915_gem_object *obj,
unsigned int tiling, unsigned int stride)
......
......@@ -11,16 +11,11 @@
#include "i915_gemfs.h"
#include "i915_utils.h"
int i915_gemfs_init(struct drm_i915_private *i915)
void i915_gemfs_init(struct drm_i915_private *i915)
{
char huge_opt[] = "huge=within_size"; /* r/w */
struct file_system_type *type;
struct vfsmount *gemfs;
char *opts;
type = get_fs_type("tmpfs");
if (!type)
return -ENODEV;
/*
* By creating our own shmemfs mountpoint, we can pass in
......@@ -28,30 +23,35 @@ int i915_gemfs_init(struct drm_i915_private *i915)
*
* One example, although it is probably better with a per-file
* control, is selecting huge page allocations ("huge=within_size").
* However, we only do so to offset the overhead of iommu lookups
* due to bandwidth issues (slow reads) on Broadwell+.
* However, we only do so on platforms which benefit from it, or to
* offset the overhead of iommu lookups, where with the latter it is a net
* win even on platforms which would otherwise see some performance
* regressions, such as the slow reads issue on Broadwell and Skylake.
*/
opts = NULL;
if (i915_vtd_active(i915)) {
if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
opts = huge_opt;
drm_info(&i915->drm,
"Transparent Hugepage mode '%s'\n",
opts);
} else {
drm_notice(&i915->drm,
"Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!\n");
}
}
gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts);
if (GRAPHICS_VER(i915) < 11 && !i915_vtd_active(i915))
return;
if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
goto err;
type = get_fs_type("tmpfs");
if (!type)
goto err;
gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, huge_opt);
if (IS_ERR(gemfs))
return PTR_ERR(gemfs);
goto err;
i915->mm.gemfs = gemfs;
return 0;
drm_info(&i915->drm, "Using Transparent Hugepages\n");
return;
err:
drm_notice(&i915->drm,
"Transparent Hugepage support is recommended for optimal performance%s\n",
GRAPHICS_VER(i915) >= 11 ? " on this platform!" :
" when IOMMU is enabled!");
}
void i915_gemfs_fini(struct drm_i915_private *i915)
......
......@@ -9,8 +9,7 @@
struct drm_i915_private;
int i915_gemfs_init(struct drm_i915_private *i915);
void i915_gemfs_init(struct drm_i915_private *i915);
void i915_gemfs_fini(struct drm_i915_private *i915);
#endif
......@@ -6,6 +6,7 @@
#include "i915_selftest.h"
#include "gt/intel_context.h"
#include "gt/intel_engine_regs.h"
#include "gt/intel_engine_user.h"
#include "gt/intel_gpu_commands.h"
#include "gt/intel_gt.h"
......@@ -18,10 +19,71 @@
#include "huge_gem_object.h"
#include "mock_context.h"
#define OW_SIZE 16 /* in bytes */
#define F_SUBTILE_SIZE 64 /* in bytes */
#define F_TILE_WIDTH 128 /* in bytes */
#define F_TILE_HEIGHT 32 /* in pixels */
#define F_SUBTILE_WIDTH OW_SIZE /* in bytes */
#define F_SUBTILE_HEIGHT 4 /* in pixels */
static int linear_x_y_to_ftiled_pos(int x, int y, u32 stride, int bpp)
{
int tile_base;
int tile_x, tile_y;
int swizzle, subtile;
int pixel_size = bpp / 8;
int pos;
/*
* Subtile remapping for F tile. Note that map[a]==b implies map[b]==a
* so we can use the same table to tile and untile.
*/
static const u8 f_subtile_map[] = {
0, 1, 2, 3, 8, 9, 10, 11,
4, 5, 6, 7, 12, 13, 14, 15,
16, 17, 18, 19, 24, 25, 26, 27,
20, 21, 22, 23, 28, 29, 30, 31,
32, 33, 34, 35, 40, 41, 42, 43,
36, 37, 38, 39, 44, 45, 46, 47,
48, 49, 50, 51, 56, 57, 58, 59,
52, 53, 54, 55, 60, 61, 62, 63
};
x *= pixel_size;
/*
* Where does the 4k tile start (in bytes)? This is the same for Y and
* F so we can use the Y-tile algorithm to get to that point.
*/
tile_base =
y / F_TILE_HEIGHT * stride * F_TILE_HEIGHT +
x / F_TILE_WIDTH * 4096;
/* Find pixel within tile */
tile_x = x % F_TILE_WIDTH;
tile_y = y % F_TILE_HEIGHT;
/* And figure out the subtile within the 4k tile */
subtile = tile_y / F_SUBTILE_HEIGHT * 8 + tile_x / F_SUBTILE_WIDTH;
/* Swizzle the subtile number according to the bspec diagram */
swizzle = f_subtile_map[subtile];
/* Calculate new position */
pos = tile_base +
swizzle * F_SUBTILE_SIZE +
tile_y % F_SUBTILE_HEIGHT * OW_SIZE +
tile_x % F_SUBTILE_WIDTH;
GEM_BUG_ON(!IS_ALIGNED(pos, pixel_size));
return pos / pixel_size * 4;
}
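
A worked example of the remapping above (input values chosen for illustration, not taken from the patch):

	/*
	 * Pixel (x=16, y=1) of a 128-pixel-wide 32bpp surface (stride = 512
	 * bytes): x scales to 64 bytes, tile_base = 0, subtile = 1/4 * 8 +
	 * 64/16 = 4, f_subtile_map[4] = 8, so pos = 8 * F_SUBTILE_SIZE +
	 * (1 % 4) * OW_SIZE + 64 % 16 = 528, i.e. byte offset 528.
	 */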
enum client_tiling {
CLIENT_TILING_LINEAR,
CLIENT_TILING_X,
CLIENT_TILING_Y,
CLIENT_TILING_4,
CLIENT_NUM_TILING_TYPES
};
......@@ -45,6 +107,36 @@ struct tiled_blits {
u32 height;
};
static bool supports_x_tiling(const struct drm_i915_private *i915)
{
int gen = GRAPHICS_VER(i915);
if (gen < 12)
return true;
if (!HAS_LMEM(i915) || IS_DG1(i915))
return false;
return true;
}
static bool fast_blit_ok(const struct blit_buffer *buf)
{
int gen = GRAPHICS_VER(buf->vma->vm->i915);
if (gen < 9)
return false;
if (gen < 12)
return true;
/* filter out platforms without X-tile support in fast blits */
if (buf->tiling == CLIENT_TILING_X && !supports_x_tiling(buf->vma->vm->i915))
return false;
return true;
}
static int prepare_blit(const struct tiled_blits *t,
struct blit_buffer *dst,
struct blit_buffer *src,
......@@ -59,51 +151,103 @@ static int prepare_blit(const struct tiled_blits *t,
if (IS_ERR(cs))
return PTR_ERR(cs);
*cs++ = MI_LOAD_REGISTER_IMM(1);
*cs++ = i915_mmio_reg_offset(BCS_SWCTRL);
cmd = (BCS_SRC_Y | BCS_DST_Y) << 16;
if (src->tiling == CLIENT_TILING_Y)
cmd |= BCS_SRC_Y;
if (dst->tiling == CLIENT_TILING_Y)
cmd |= BCS_DST_Y;
*cs++ = cmd;
cmd = MI_FLUSH_DW;
if (ver >= 8)
cmd++;
*cs++ = cmd;
*cs++ = 0;
*cs++ = 0;
*cs++ = 0;
cmd = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (8 - 2);
if (ver >= 8)
cmd += 2;
src_pitch = t->width * 4;
if (src->tiling) {
cmd |= XY_SRC_COPY_BLT_SRC_TILED;
src_pitch /= 4;
}
if (fast_blit_ok(dst) && fast_blit_ok(src)) {
struct intel_gt *gt = t->ce->engine->gt;
u32 src_tiles = 0, dst_tiles = 0;
u32 src_4t = 0, dst_4t = 0;
/* Need to program BLIT_CCTL before using XY_FAST_COPY_BLT,
* if it has not been done previously
*/
*cs++ = MI_LOAD_REGISTER_IMM(1);
*cs++ = i915_mmio_reg_offset(BLIT_CCTL(t->ce->engine->mmio_base));
*cs++ = (BLIT_CCTL_SRC_MOCS(gt->mocs.uc_index) |
BLIT_CCTL_DST_MOCS(gt->mocs.uc_index));
src_pitch = t->width; /* in dwords */
if (src->tiling == CLIENT_TILING_4) {
src_tiles = XY_FAST_COPY_BLT_D0_SRC_TILE_MODE(YMAJOR);
src_4t = XY_FAST_COPY_BLT_D1_SRC_TILE4;
} else if (src->tiling == CLIENT_TILING_Y) {
src_tiles = XY_FAST_COPY_BLT_D0_SRC_TILE_MODE(YMAJOR);
} else if (src->tiling == CLIENT_TILING_X) {
src_tiles = XY_FAST_COPY_BLT_D0_SRC_TILE_MODE(TILE_X);
} else {
src_pitch *= 4; /* in bytes */
}
dst_pitch = t->width * 4;
if (dst->tiling) {
cmd |= XY_SRC_COPY_BLT_DST_TILED;
dst_pitch /= 4;
}
dst_pitch = t->width; /* in dwords */
if (dst->tiling == CLIENT_TILING_4) {
dst_tiles = XY_FAST_COPY_BLT_D0_DST_TILE_MODE(YMAJOR);
dst_4t = XY_FAST_COPY_BLT_D1_DST_TILE4;
} else if (dst->tiling == CLIENT_TILING_Y) {
dst_tiles = XY_FAST_COPY_BLT_D0_DST_TILE_MODE(YMAJOR);
} else if (dst->tiling == CLIENT_TILING_X) {
dst_tiles = XY_FAST_COPY_BLT_D0_DST_TILE_MODE(TILE_X);
} else {
dst_pitch *= 4; /* in bytes */
}
*cs++ = cmd;
*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | dst_pitch;
*cs++ = 0;
*cs++ = t->height << 16 | t->width;
*cs++ = lower_32_bits(dst->vma->node.start);
if (use_64b_reloc)
*cs++ = GEN9_XY_FAST_COPY_BLT_CMD | (10 - 2) |
src_tiles | dst_tiles;
*cs++ = src_4t | dst_4t | BLT_DEPTH_32 | dst_pitch;
*cs++ = 0;
*cs++ = t->height << 16 | t->width;
*cs++ = lower_32_bits(dst->vma->node.start);
*cs++ = upper_32_bits(dst->vma->node.start);
*cs++ = 0;
*cs++ = src_pitch;
*cs++ = lower_32_bits(src->vma->node.start);
if (use_64b_reloc)
*cs++ = 0;
*cs++ = src_pitch;
*cs++ = lower_32_bits(src->vma->node.start);
*cs++ = upper_32_bits(src->vma->node.start);
} else {
if (ver >= 6) {
*cs++ = MI_LOAD_REGISTER_IMM(1);
*cs++ = i915_mmio_reg_offset(BCS_SWCTRL);
cmd = (BCS_SRC_Y | BCS_DST_Y) << 16;
if (src->tiling == CLIENT_TILING_Y)
cmd |= BCS_SRC_Y;
if (dst->tiling == CLIENT_TILING_Y)
cmd |= BCS_DST_Y;
*cs++ = cmd;
cmd = MI_FLUSH_DW;
if (ver >= 8)
cmd++;
*cs++ = cmd;
*cs++ = 0;
*cs++ = 0;
*cs++ = 0;
}
cmd = XY_SRC_COPY_BLT_CMD | BLT_WRITE_RGBA | (8 - 2);
if (ver >= 8)
cmd += 2;
src_pitch = t->width * 4;
if (src->tiling) {
cmd |= XY_SRC_COPY_BLT_SRC_TILED;
src_pitch /= 4;
}
dst_pitch = t->width * 4;
if (dst->tiling) {
cmd |= XY_SRC_COPY_BLT_DST_TILED;
dst_pitch /= 4;
}
*cs++ = cmd;
*cs++ = BLT_DEPTH_32 | BLT_ROP_SRC_COPY | dst_pitch;
*cs++ = 0;
*cs++ = t->height << 16 | t->width;
*cs++ = lower_32_bits(dst->vma->node.start);
if (use_64b_reloc)
*cs++ = upper_32_bits(dst->vma->node.start);
*cs++ = 0;
*cs++ = src_pitch;
*cs++ = lower_32_bits(src->vma->node.start);
if (use_64b_reloc)
*cs++ = upper_32_bits(src->vma->node.start);
}
*cs++ = MI_BATCH_BUFFER_END;
......@@ -181,7 +325,13 @@ static int tiled_blits_create_buffers(struct tiled_blits *t,
t->buffers[i].vma = vma;
t->buffers[i].tiling =
i915_prandom_u32_max_state(CLIENT_TILING_Y + 1, prng);
i915_prandom_u32_max_state(CLIENT_NUM_TILING_TYPES, prng);
/* Platforms support either TileY or Tile4, not both */
if (HAS_4TILE(i915) && t->buffers[i].tiling == CLIENT_TILING_Y)
t->buffers[i].tiling = CLIENT_TILING_4;
else if (!HAS_4TILE(i915) && t->buffers[i].tiling == CLIENT_TILING_4)
t->buffers[i].tiling = CLIENT_TILING_Y;
}
return 0;
......@@ -206,7 +356,8 @@ static u64 swizzle_bit(unsigned int bit, u64 offset)
static u64 tiled_offset(const struct intel_gt *gt,
u64 v,
unsigned int stride,
enum client_tiling tiling)
enum client_tiling tiling,
int x_pos, int y_pos)
{
unsigned int swizzle;
u64 x, y;
......@@ -216,7 +367,12 @@ static u64 tiled_offset(const struct intel_gt *gt,
y = div64_u64_rem(v, stride, &x);
if (tiling == CLIENT_TILING_X) {
if (tiling == CLIENT_TILING_4) {
v = linear_x_y_to_ftiled_pos(x_pos, y_pos, stride, 32);
/* no swizzling for f-tiling */
swizzle = I915_BIT_6_SWIZZLE_NONE;
} else if (tiling == CLIENT_TILING_X) {
v = div64_u64_rem(y, 8, &y) * stride * 8;
v += y * 512;
v += div64_u64_rem(x, 512, &x) << 12;
......@@ -259,6 +415,7 @@ static const char *repr_tiling(enum client_tiling tiling)
case CLIENT_TILING_LINEAR: return "linear";
case CLIENT_TILING_X: return "X";
case CLIENT_TILING_Y: return "Y";
case CLIENT_TILING_4: return "F";
default: return "unknown";
}
}
......@@ -284,7 +441,7 @@ static int verify_buffer(const struct tiled_blits *t,
} else {
u64 v = tiled_offset(buf->vma->vm->gt,
p * 4, t->width * 4,
buf->tiling);
buf->tiling, x, y);
if (vaddr[v / sizeof(*vaddr)] != buf->start_val + p)
ret = -EINVAL;
......@@ -504,6 +661,9 @@ static int tiled_blits_bounce(struct tiled_blits *t, struct rnd_state *prng)
if (err)
return err;
/* Simulating GTT eviction of the same buffer / layout */
t->buffers[2].tiling = t->buffers[0].tiling;
/* Reposition so that we overlap the old addresses, and slightly off */
err = tiled_blit(t,
&t->buffers[2], t->hole + t->align,
......
......@@ -212,7 +212,7 @@ static int __live_parallel_switch1(void *data)
i915_request_add(rq);
}
if (i915_request_wait(rq, 0, HZ / 5) < 0)
if (i915_request_wait(rq, 0, HZ) < 0)
err = -ETIME;
i915_request_put(rq);
if (err)
......
......@@ -197,8 +197,10 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
flags |= PIPE_CONTROL_CS_STALL;
if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_FLAGS;
if (!HAS_3D_PIPELINE(engine->i915))
flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
else if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
cs = intel_ring_begin(rq, 6);
if (IS_ERR(cs))
......@@ -227,8 +229,10 @@ int gen12_emit_flush_rcs(struct i915_request *rq, u32 mode)
flags |= PIPE_CONTROL_CS_STALL;
if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_FLAGS;
if (!HAS_3D_PIPELINE(engine->i915))
flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
else if (engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
if (!HAS_FLAT_CCS(rq->engine->i915))
count = 8 + 4;
......@@ -272,7 +276,8 @@ int gen12_emit_flush_xcs(struct i915_request *rq, u32 mode)
if (!HAS_FLAT_CCS(rq->engine->i915) &&
(rq->engine->class == VIDEO_DECODE_CLASS ||
rq->engine->class == VIDEO_ENHANCEMENT_CLASS)) {
aux_inv = rq->engine->mask & ~BIT(BCS0);
aux_inv = rq->engine->mask &
~GENMASK(_BCS(I915_MAX_BCS - 1), BCS0);
if (aux_inv)
cmd += 4;
}
......@@ -716,8 +721,10 @@ u32 *gen12_emit_fini_breadcrumb_rcs(struct i915_request *rq, u32 *cs)
/* Wa_1409600907 */
flags |= PIPE_CONTROL_DEPTH_STALL;
if (rq->engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_FLAGS;
if (!HAS_3D_PIPELINE(rq->engine->i915))
flags &= ~PIPE_CONTROL_3D_ARCH_FLAGS;
else if (rq->engine->class == COMPUTE_CLASS)
flags &= ~PIPE_CONTROL_3D_ENGINE_FLAGS;
cs = gen12_emit_ggtt_write_rcs(cs,
rq->fence.seqno,
......
......@@ -601,6 +601,30 @@ u64 intel_context_get_avg_runtime_ns(struct intel_context *ce)
return avg;
}
bool intel_context_ban(struct intel_context *ce, struct i915_request *rq)
{
bool ret = intel_context_set_banned(ce);
trace_intel_context_ban(ce);
if (ce->ops->revoke)
ce->ops->revoke(ce, rq,
INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS);
return ret;
}
bool intel_context_exit_nonpersistent(struct intel_context *ce,
struct i915_request *rq)
{
bool ret = intel_context_set_exiting(ce);
if (ce->ops->revoke)
ce->ops->revoke(ce, rq, ce->engine->props.preempt_timeout_ms);
return ret;
}
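
The two revoke paths above differ only in the timeout budget handed to the backend; a summary drawn from the definitions in this diff:

	/*
	 * A banned context is revoked with the aggressive
	 * INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS (1 ms, ignoring sysfs),
	 * while a non-persistent context that is merely exiting gets the
	 * engine's regular props.preempt_timeout_ms, so sysfs tuning still
	 * applies to the graceful case.
	 */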
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftest_context.c"
#endif
......@@ -25,6 +25,8 @@
##__VA_ARGS__); \
} while (0)
#define INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS (1)
struct i915_gem_ww_ctx;
void intel_context_init(struct intel_context *ce,
......@@ -309,18 +311,27 @@ static inline bool intel_context_set_banned(struct intel_context *ce)
return test_and_set_bit(CONTEXT_BANNED, &ce->flags);
}
static inline bool intel_context_ban(struct intel_context *ce,
struct i915_request *rq)
bool intel_context_ban(struct intel_context *ce, struct i915_request *rq);
static inline bool intel_context_is_schedulable(const struct intel_context *ce)
{
bool ret = intel_context_set_banned(ce);
return !test_bit(CONTEXT_EXITING, &ce->flags) &&
!test_bit(CONTEXT_BANNED, &ce->flags);
}
trace_intel_context_ban(ce);
if (ce->ops->ban)
ce->ops->ban(ce, rq);
static inline bool intel_context_is_exiting(const struct intel_context *ce)
{
return test_bit(CONTEXT_EXITING, &ce->flags);
}
return ret;
static inline bool intel_context_set_exiting(struct intel_context *ce)
{
return test_and_set_bit(CONTEXT_EXITING, &ce->flags);
}
bool intel_context_exit_nonpersistent(struct intel_context *ce,
struct i915_request *rq);
static inline bool
intel_context_force_single_submission(const struct intel_context *ce)
{
......
......@@ -40,7 +40,8 @@ struct intel_context_ops {
int (*alloc)(struct intel_context *ce);
void (*ban)(struct intel_context *ce, struct i915_request *rq);
void (*revoke)(struct intel_context *ce, struct i915_request *rq,
unsigned int preempt_timeout_ms);
int (*pre_pin)(struct intel_context *ce, struct i915_gem_ww_ctx *ww, void **vaddr);
int (*pin)(struct intel_context *ce, void *vaddr);
......@@ -122,6 +123,7 @@ struct intel_context {
#define CONTEXT_GUC_INIT 10
#define CONTEXT_PERMA_PIN 11
#define CONTEXT_IS_PARKING 12
#define CONTEXT_EXITING 13
struct {
u64 timeout_us;
......
......@@ -201,6 +201,8 @@ int intel_ring_submission_setup(struct intel_engine_cs *engine);
int intel_engine_stop_cs(struct intel_engine_cs *engine);
void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine);
void intel_engine_wait_for_pending_mi_fw(struct intel_engine_cs *engine);
void intel_engine_set_hwsp_writemask(struct intel_engine_cs *engine, u32 mask);
u64 intel_engine_get_active_head(const struct intel_engine_cs *engine);
......
......@@ -21,8 +21,9 @@
#include "intel_engine_user.h"
#include "intel_execlists_submission.h"
#include "intel_gt.h"
#include "intel_gt_requests.h"
#include "intel_gt_mcr.h"
#include "intel_gt_pm.h"
#include "intel_gt_requests.h"
#include "intel_lrc.h"
#include "intel_lrc_reg.h"
#include "intel_reset.h"
......@@ -71,6 +72,62 @@ static const struct engine_info intel_engines[] = {
{ .graphics_ver = 6, .base = BLT_RING_BASE }
},
},
[BCS1] = {
.class = COPY_ENGINE_CLASS,
.instance = 1,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHPC_BCS1_RING_BASE }
},
},
[BCS2] = {
.class = COPY_ENGINE_CLASS,
.instance = 2,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHPC_BCS2_RING_BASE }
},
},
[BCS3] = {
.class = COPY_ENGINE_CLASS,
.instance = 3,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHPC_BCS3_RING_BASE }
},
},
[BCS4] = {
.class = COPY_ENGINE_CLASS,
.instance = 4,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHPC_BCS4_RING_BASE }
},
},
[BCS5] = {
.class = COPY_ENGINE_CLASS,
.instance = 5,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHPC_BCS5_RING_BASE }
},
},
[BCS6] = {
.class = COPY_ENGINE_CLASS,
.instance = 6,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHPC_BCS6_RING_BASE }
},
},
[BCS7] = {
.class = COPY_ENGINE_CLASS,
.instance = 7,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHPC_BCS7_RING_BASE }
},
},
[BCS8] = {
.class = COPY_ENGINE_CLASS,
.instance = 8,
.mmio_bases = {
{ .graphics_ver = 12, .base = XEHPC_BCS8_RING_BASE }
},
},
[VCS0] = {
.class = VIDEO_DECODE_CLASS,
.instance = 0,
......@@ -334,6 +391,14 @@ static u32 get_reset_domain(u8 ver, enum intel_engine_id id)
static const u32 engine_reset_domains[] = {
[RCS0] = GEN11_GRDOM_RENDER,
[BCS0] = GEN11_GRDOM_BLT,
[BCS1] = XEHPC_GRDOM_BLT1,
[BCS2] = XEHPC_GRDOM_BLT2,
[BCS3] = XEHPC_GRDOM_BLT3,
[BCS4] = XEHPC_GRDOM_BLT4,
[BCS5] = XEHPC_GRDOM_BLT5,
[BCS6] = XEHPC_GRDOM_BLT6,
[BCS7] = XEHPC_GRDOM_BLT7,
[BCS8] = XEHPC_GRDOM_BLT8,
[VCS0] = GEN11_GRDOM_MEDIA,
[VCS1] = GEN11_GRDOM_MEDIA2,
[VCS2] = GEN11_GRDOM_MEDIA3,
......@@ -610,8 +675,8 @@ static void engine_mask_apply_compute_fuses(struct intel_gt *gt)
if (GRAPHICS_VER_FULL(i915) < IP_VER(12, 50))
return;
ccs_mask = intel_slicemask_from_dssmask(intel_sseu_get_compute_subslices(&info->sseu),
ss_per_ccs);
ccs_mask = intel_slicemask_from_xehp_dssmask(info->sseu.compute_subslice_mask,
ss_per_ccs);
/*
* If all DSS in a quadrant are fused off, the corresponding CCS
* engine is not available for use.
......@@ -622,6 +687,34 @@ static void engine_mask_apply_compute_fuses(struct intel_gt *gt)
}
}
static void engine_mask_apply_copy_fuses(struct intel_gt *gt)
{
struct drm_i915_private *i915 = gt->i915;
struct intel_gt_info *info = &gt->info;
unsigned long meml3_mask;
unsigned long quad;
meml3_mask = intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3);
meml3_mask = REG_FIELD_GET(GEN12_MEML3_EN_MASK, meml3_mask);
/*
* Link Copy engines may be fused off according to meml3_mask. Each
* bit is a quad that houses two Link Copy and two Sub Copy engines.
*/
for_each_clear_bit(quad, &meml3_mask, GEN12_MAX_MSLICES) {
unsigned int instance = quad * 2 + 1;
intel_engine_mask_t mask = GENMASK(_BCS(instance + 1),
_BCS(instance));
if (mask & info->engine_mask) {
drm_dbg(&i915->drm, "bcs%u fused off\n", instance);
drm_dbg(&i915->drm, "bcs%u fused off\n", instance + 1);
info->engine_mask &= ~mask;
}
}
}
/*
* Determine which engines are fused off in our particular hardware.
* Note that we have a catch-22 situation where we need to be able to access
......@@ -704,6 +797,7 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
GEM_BUG_ON(vebox_mask != VEBOX_MASK(gt));
engine_mask_apply_compute_fuses(gt);
engine_mask_apply_copy_fuses(gt);
return info->engine_mask;
}
......@@ -1282,10 +1376,10 @@ static int __intel_engine_stop_cs(struct intel_engine_cs *engine,
intel_uncore_write_fw(uncore, mode, _MASKED_BIT_ENABLE(STOP_RING));
/*
* Wa_22011802037 : gen12, Prior to doing a reset, ensure CS is
* Wa_22011802037 : gen11, gen12, Prior to doing a reset, ensure CS is
* stopped, set ring stop bit and prefetch disable bit to halt CS
*/
if (GRAPHICS_VER(engine->i915) == 12)
if (IS_GRAPHICS_VER(engine->i915, 11, 12))
intel_uncore_write_fw(uncore, RING_MODE_GEN7(engine->mmio_base),
_MASKED_BIT_ENABLE(GEN12_GFX_PREFETCH_DISABLE));
......@@ -1308,6 +1402,18 @@ int intel_engine_stop_cs(struct intel_engine_cs *engine)
return -ENODEV;
ENGINE_TRACE(engine, "\n");
/*
* TODO: Find out why occasionally stopping the CS times out. Seen
* especially with gem_eio tests.
*
* Occasionally trying to stop the cs times out, but does not adversely
* affect functionality. The timeout is set as a config parameter that
* defaults to 100ms. In most cases the follow up operation is to wait
* for pending MI_FORCE_WAKES. The assumption is that this timeout is
* sufficient for any pending MI_FORCEWAKEs to complete. Once root
* caused, the caller must check and handle the return from this
* function.
*/
if (__intel_engine_stop_cs(engine, 1000, stop_timeout(engine))) {
ENGINE_TRACE(engine,
"timed out on STOP_RING -> IDLE; HEAD:%04x, TAIL:%04x\n",
......@@ -1334,12 +1440,76 @@ void intel_engine_cancel_stop_cs(struct intel_engine_cs *engine)
ENGINE_WRITE_FW(engine, RING_MI_MODE, _MASKED_BIT_DISABLE(STOP_RING));
}
static u32
read_subslice_reg(const struct intel_engine_cs *engine,
int slice, int subslice, i915_reg_t reg)
static u32 __cs_pending_mi_force_wakes(struct intel_engine_cs *engine)
{
static const i915_reg_t _reg[I915_NUM_ENGINES] = {
[RCS0] = MSG_IDLE_CS,
[BCS0] = MSG_IDLE_BCS,
[VCS0] = MSG_IDLE_VCS0,
[VCS1] = MSG_IDLE_VCS1,
[VCS2] = MSG_IDLE_VCS2,
[VCS3] = MSG_IDLE_VCS3,
[VCS4] = MSG_IDLE_VCS4,
[VCS5] = MSG_IDLE_VCS5,
[VCS6] = MSG_IDLE_VCS6,
[VCS7] = MSG_IDLE_VCS7,
[VECS0] = MSG_IDLE_VECS0,
[VECS1] = MSG_IDLE_VECS1,
[VECS2] = MSG_IDLE_VECS2,
[VECS3] = MSG_IDLE_VECS3,
[CCS0] = MSG_IDLE_CS,
[CCS1] = MSG_IDLE_CS,
[CCS2] = MSG_IDLE_CS,
[CCS3] = MSG_IDLE_CS,
};
u32 val;
if (!_reg[engine->id].reg) {
drm_err(&engine->i915->drm,
"MSG IDLE undefined for engine id %u\n", engine->id);
return 0;
}
val = intel_uncore_read(engine->uncore, _reg[engine->id]);
/* bits[29:25] & bits[13:9] >> shift */
return (val & (val >> 16) & MSG_IDLE_FW_MASK) >> MSG_IDLE_FW_SHIFT;
}
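
A worked example of the mask trick above (the register value is hypothetical; per the in-code comment, bits 29:25 are the valid-mask for status bits 13:9):

	/*
	 * val = 0x02000200: mask bit 25 and status bit 9 are both set, so
	 * (val & (val >> 16) & MSG_IDLE_FW_MASK) >> MSG_IDLE_FW_SHIFT == 0x1,
	 * i.e. exactly one forcewake request is still pending.
	 */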
static void __gpm_wait_for_fw_complete(struct intel_gt *gt, u32 fw_mask)
{
return intel_uncore_read_with_mcr_steering(engine->uncore, reg,
slice, subslice);
int ret;
/* Ensure GPM receives fw up/down after CS is stopped */
udelay(1);
/* Wait for forcewake request to complete in GPM */
ret = __intel_wait_for_register_fw(gt->uncore,
GEN9_PWRGT_DOMAIN_STATUS,
fw_mask, fw_mask, 5000, 0, NULL);
/* Ensure CS receives fw ack from GPM */
udelay(1);
if (ret)
GT_TRACE(gt, "Failed to complete pending forcewake %d\n", ret);
}
/*
* Wa_22011802037:gen12: In addition to stopping the cs, we need to wait for any
* pending MI_FORCE_WAKEUP requests that the CS has initiated to complete. The
* pending status is indicated by bits[13:9] (masked by bits[29:25]) in the
* MSG_IDLE register. There's one MSG_IDLE register per reset domain. Since we
* are concerned only with the gt reset here, we use a logical OR of pending
* forcewakeups from all reset domains and then wait for them to complete by
* querying PWRGT_DOMAIN_STATUS.
*/
void intel_engine_wait_for_pending_mi_fw(struct intel_engine_cs *engine)
{
u32 fw_pending = __cs_pending_mi_force_wakes(engine);
if (fw_pending)
__gpm_wait_for_fw_complete(engine->gt, fw_pending);
}
/* NB: please notice the memset */
......@@ -1375,28 +1545,33 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
instdone->sampler[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_SAMPLER_INSTDONE);
intel_gt_mcr_read(engine->gt,
GEN7_SAMPLER_INSTDONE,
slice, subslice);
instdone->row[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_ROW_INSTDONE);
intel_gt_mcr_read(engine->gt,
GEN7_ROW_INSTDONE,
slice, subslice);
}
} else {
for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
instdone->sampler[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_SAMPLER_INSTDONE);
intel_gt_mcr_read(engine->gt,
GEN7_SAMPLER_INSTDONE,
slice, subslice);
instdone->row[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_ROW_INSTDONE);
intel_gt_mcr_read(engine->gt,
GEN7_ROW_INSTDONE,
slice, subslice);
}
}
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
instdone->geom_svg[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
XEHPG_INSTDONE_GEOM_SVG);
intel_gt_mcr_read(engine->gt,
XEHPG_INSTDONE_GEOM_SVG,
slice, subslice);
}
} else if (GRAPHICS_VER(i915) >= 7) {
instdone->instdone =
......
......@@ -8,6 +8,7 @@
#include "i915_reg_defs.h"
#define RING_EXCC(base) _MMIO((base) + 0x28)
#define RING_TAIL(base) _MMIO((base) + 0x30)
#define TAIL_ADDR 0x001FFFF8
#define RING_HEAD(base) _MMIO((base) + 0x34)
......@@ -133,6 +134,8 @@
(REG_FIELD_PREP(BLIT_CCTL_DST_MOCS_MASK, (dst) << 1) | \
REG_FIELD_PREP(BLIT_CCTL_SRC_MOCS_MASK, (src) << 1))
#define RING_CSCMDOP(base) _MMIO((base) + 0x20c)
/*
* CMD_CCTL read/write fields take a MOCS value and _not_ a table index.
* The lsb of each can be considered a separate enabling bit for encryption.
......@@ -149,6 +152,7 @@
REG_FIELD_PREP(CMD_CCTL_READ_OVERRIDE_MASK, (read) << 1))
#define RING_PREDICATE_RESULT(base) _MMIO((base) + 0x3b8) /* gen12+ */
#define MI_PREDICATE_RESULT_2(base) _MMIO((base) + 0x3bc)
#define LOWER_SLICE_ENABLED (1 << 0)
#define LOWER_SLICE_DISABLED (0 << 0)
......@@ -172,6 +176,7 @@
#define CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT REG_BIT(2)
#define CTX_CTRL_INHIBIT_SYN_CTX_SWITCH REG_BIT(3)
#define GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE REG_BIT(8)
#define RING_CTX_SR_CTL(base) _MMIO((base) + 0x244)
#define RING_SEMA_WAIT_POLL(base) _MMIO((base) + 0x24c)
#define GEN8_RING_PDP_UDW(base, n) _MMIO((base) + 0x270 + (n) * 8 + 4)
#define GEN8_RING_PDP_LDW(base, n) _MMIO((base) + 0x270 + (n) * 8)
......@@ -196,6 +201,7 @@
#define RING_CTX_TIMESTAMP(base) _MMIO((base) + 0x3a8) /* gen8+ */
#define RING_PREDICATE_RESULT(base) _MMIO((base) + 0x3b8)
#define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base) + 0x4D0) + (i) * 4)
#define RING_FORCE_TO_NONPRIV_DENY REG_BIT(30)
#define RING_FORCE_TO_NONPRIV_ADDRESS_MASK REG_GENMASK(25, 2)
#define RING_FORCE_TO_NONPRIV_ACCESS_RW (0 << 28) /* CFL+ & Gen11+ */
#define RING_FORCE_TO_NONPRIV_ACCESS_RD (1 << 28)
......@@ -208,7 +214,9 @@
#define RING_FORCE_TO_NONPRIV_RANGE_64 (3 << 0)
#define RING_FORCE_TO_NONPRIV_RANGE_MASK (3 << 0)
#define RING_FORCE_TO_NONPRIV_MASK_VALID \
(RING_FORCE_TO_NONPRIV_RANGE_MASK | RING_FORCE_TO_NONPRIV_ACCESS_MASK)
(RING_FORCE_TO_NONPRIV_RANGE_MASK | \
RING_FORCE_TO_NONPRIV_ACCESS_MASK | \
RING_FORCE_TO_NONPRIV_DENY)
#define RING_MAX_NONPRIV_SLOTS 12
#define RING_EXECLIST_SQ_CONTENTS(base) _MMIO((base) + 0x510)
......
......@@ -35,7 +35,7 @@
#define OTHER_CLASS 4
#define COMPUTE_CLASS 5
#define MAX_ENGINE_CLASS 5
#define MAX_ENGINE_INSTANCE 7
#define MAX_ENGINE_INSTANCE 8
#define I915_MAX_SLICES 3
#define I915_MAX_SUBSLICES 8
......@@ -99,6 +99,7 @@ struct i915_ctx_workarounds {
#define I915_MAX_SFC (I915_MAX_VCS / 2)
#define I915_MAX_CCS 4
#define I915_MAX_RCS 1
#define I915_MAX_BCS 9
/*
* Engine IDs definitions.
......@@ -107,6 +108,15 @@ struct i915_ctx_workarounds {
enum intel_engine_id {
RCS0 = 0,
BCS0,
BCS1,
BCS2,
BCS3,
BCS4,
BCS5,
BCS6,
BCS7,
BCS8,
#define _BCS(n) (BCS0 + (n))
VCS0,
VCS1,
VCS2,
......
......@@ -480,9 +480,9 @@ __execlists_schedule_in(struct i915_request *rq)
if (unlikely(intel_context_is_closed(ce) &&
!intel_engine_has_heartbeat(engine)))
intel_context_set_banned(ce);
intel_context_set_exiting(ce);
if (unlikely(intel_context_is_banned(ce) || bad_request(rq)))
if (unlikely(!intel_context_is_schedulable(ce) || bad_request(rq)))
reset_active(rq, engine);
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
......@@ -661,6 +661,16 @@ static inline void execlists_schedule_out(struct i915_request *rq)
i915_request_put(rq);
}
static u32 map_i915_prio_to_lrc_desc_prio(int prio)
{
if (prio > I915_PRIORITY_NORMAL)
return GEN12_CTX_PRIORITY_HIGH;
else if (prio < I915_PRIORITY_NORMAL)
return GEN12_CTX_PRIORITY_LOW;
else
return GEN12_CTX_PRIORITY_NORMAL;
}
static u64 execlists_update_context(struct i915_request *rq)
{
struct intel_context *ce = rq->context;
......@@ -669,7 +679,7 @@ static u64 execlists_update_context(struct i915_request *rq)
desc = ce->lrc.desc;
if (rq->engine->flags & I915_ENGINE_HAS_EU_PRIORITY)
desc |= lrc_desc_priority(rq_prio(rq));
desc |= map_i915_prio_to_lrc_desc_prio(rq_prio(rq));
/*
* WaIdleLiteRestore:bdw,skl
......@@ -1233,7 +1243,7 @@ static unsigned long active_preempt_timeout(struct intel_engine_cs *engine,
/* Force a fast reset for terminated contexts (ignoring sysfs!) */
if (unlikely(intel_context_is_banned(rq->context) || bad_request(rq)))
return 1;
return INTEL_CONTEXT_BANNED_PREEMPT_TIMEOUT_MS;
return READ_ONCE(engine->props.preempt_timeout_ms);
}
......@@ -2958,6 +2968,13 @@ static void execlists_reset_prepare(struct intel_engine_cs *engine)
ring_set_paused(engine, 1);
intel_engine_stop_cs(engine);
/*
* Wa_22011802037:gen11/gen12: In addition to stopping the cs, we need
* to wait for any pending mi force wakeups
*/
if (IS_GRAPHICS_VER(engine->i915, 11, 12))
intel_engine_wait_for_pending_mi_fw(engine);
engine->execlists.reset_ccid = active_ccid(engine);
}
......
// SPDX-License-Identifier: MIT
/*
* Copyright © 2022 Intel Corporation
*/
#include "intel_ggtt_gmch.h"
#include <drm/intel-gtt.h>
#include <drm/i915_drm.h>
#include <linux/agp_backend.h>
#include "i915_drv.h"
#include "i915_utils.h"
#include "intel_gtt.h"
#include "intel_gt_regs.h"
#include "intel_gt.h"
static void gmch_ggtt_insert_page(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level cache_level,
u32 unused)
{
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
intel_gmch_gtt_insert_page(addr, offset >> PAGE_SHIFT, flags);
}
static void gmch_ggtt_insert_entries(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
u32 unused)
{
unsigned int flags = (cache_level == I915_CACHE_NONE) ?
AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
intel_gmch_gtt_insert_sg_entries(vma_res->bi.pages, vma_res->start >> PAGE_SHIFT,
flags);
}
static void gmch_ggtt_invalidate(struct i915_ggtt *ggtt)
{
intel_gmch_gtt_flush();
}
static void gmch_ggtt_clear_range(struct i915_address_space *vm,
u64 start, u64 length)
{
intel_gmch_gtt_clear_range(start >> PAGE_SHIFT, length >> PAGE_SHIFT);
}
static void gmch_ggtt_remove(struct i915_address_space *vm)
{
intel_gmch_remove();
}
/*
* Certain Gen5 chipsets require idling the GPU before unmapping anything from
* the GTT when VT-d is enabled.
*/
static bool needs_idle_maps(struct drm_i915_private *i915)
{
/*
* Query intel_iommu to see if we need the workaround. Presumably that
* was loaded first.
*/
if (!i915_vtd_active(i915))
return false;
if (GRAPHICS_VER(i915) == 5 && IS_MOBILE(i915))
return true;
return false;
}
int intel_ggtt_gmch_probe(struct i915_ggtt *ggtt)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
phys_addr_t gmadr_base;
int ret;
ret = intel_gmch_probe(i915->bridge_dev, to_pci_dev(i915->drm.dev), NULL);
if (!ret) {
drm_err(&i915->drm, "failed to set up gmch\n");
return -EIO;
}
intel_gmch_gtt_get(&ggtt->vm.total, &gmadr_base, &ggtt->mappable_end);
ggtt->gmadr =
(struct resource)DEFINE_RES_MEM(gmadr_base, ggtt->mappable_end);
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
ggtt->vm.alloc_scratch_dma = alloc_pt_dma;
if (needs_idle_maps(i915)) {
drm_notice(&i915->drm,
"Flushing DMA requests before IOMMU unmaps; performance may be degraded\n");
ggtt->do_idle_maps = true;
}
ggtt->vm.insert_page = gmch_ggtt_insert_page;
ggtt->vm.insert_entries = gmch_ggtt_insert_entries;
ggtt->vm.clear_range = gmch_ggtt_clear_range;
ggtt->vm.cleanup = gmch_ggtt_remove;
ggtt->invalidate = gmch_ggtt_invalidate;
ggtt->vm.vma_ops.bind_vma = intel_ggtt_bind_vma;
ggtt->vm.vma_ops.unbind_vma = intel_ggtt_unbind_vma;
if (unlikely(ggtt->do_idle_maps))
drm_notice(&i915->drm,
"Applying Ironlake quirks for intel_iommu\n");
return 0;
}
int intel_ggtt_gmch_enable_hw(struct drm_i915_private *i915)
{
if (!intel_gmch_enable_gtt())
return -EIO;
return 0;
}
void intel_ggtt_gmch_flush(void)
{
intel_gmch_gtt_flush();
}
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef __INTEL_GGTT_GMCH_H__
#define __INTEL_GGTT_GMCH_H__
#include "intel_gtt.h"
/* For x86 platforms */
#if IS_ENABLED(CONFIG_X86)
void intel_ggtt_gmch_flush(void);
int intel_ggtt_gmch_enable_hw(struct drm_i915_private *i915);
int intel_ggtt_gmch_probe(struct i915_ggtt *ggtt);
/* Stubs for non-x86 platforms */
#else
static inline void intel_ggtt_gmch_flush(void) { }
static inline int intel_ggtt_gmch_enable_hw(struct drm_i915_private *i915) { return -ENODEV; }
static inline int intel_ggtt_gmch_probe(struct i915_ggtt *ggtt) { return -ENODEV; }
#endif
#endif /* __INTEL_GGTT_GMCH_H__ */
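
For context, a hedged sketch of how these entry points are consumed (simplified; ggtt_probe_hw_sketch and the exact version cutoff are illustrative stand-ins for the real GGTT probe dispatch):

static int ggtt_probe_hw_sketch(struct i915_ggtt *ggtt, struct intel_gt *gt)
{
	struct drm_i915_private *i915 = gt->i915;

	/* Pre-gen6 parts take the GMCH path; on non-x86 builds the stub
	 * above simply reports -ENODEV. */
	if (GRAPHICS_VER(i915) <= 5)
		return intel_ggtt_gmch_probe(ggtt);

	return -ENODEV; /* modern gen6+ paths elided in this sketch */
}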
......@@ -236,6 +236,28 @@
#define XY_FAST_COLOR_BLT_DW 16
#define XY_FAST_COLOR_BLT_MOCS_MASK GENMASK(27, 21)
#define XY_FAST_COLOR_BLT_MEM_TYPE_SHIFT 31
#define XY_FAST_COPY_BLT_D0_SRC_TILING_MASK REG_GENMASK(21, 20)
#define XY_FAST_COPY_BLT_D0_DST_TILING_MASK REG_GENMASK(14, 13)
#define XY_FAST_COPY_BLT_D0_SRC_TILE_MODE(mode) \
REG_FIELD_PREP(XY_FAST_COPY_BLT_D0_SRC_TILING_MASK, mode)
#define XY_FAST_COPY_BLT_D0_DST_TILE_MODE(mode) \
REG_FIELD_PREP(XY_FAST_COPY_BLT_D0_DST_TILING_MASK, mode)
#define LINEAR 0
#define TILE_X 0x1
#define XMAJOR 0x1
#define YMAJOR 0x2
#define TILE_64 0x3
#define XY_FAST_COPY_BLT_D1_SRC_TILE4 REG_BIT(31)
#define XY_FAST_COPY_BLT_D1_DST_TILE4 REG_BIT(30)
#define BLIT_CCTL_SRC_MOCS_MASK REG_GENMASK(6, 0)
#define BLIT_CCTL_DST_MOCS_MASK REG_GENMASK(14, 8)
/* Note: MOCS value = (index << 1) */
#define BLIT_CCTL_SRC_MOCS(idx) \
REG_FIELD_PREP(BLIT_CCTL_SRC_MOCS_MASK, (idx) << 1)
#define BLIT_CCTL_DST_MOCS(idx) \
REG_FIELD_PREP(BLIT_CCTL_DST_MOCS_MASK, (idx) << 1)
#define SRC_COPY_BLT_CMD (2 << 29 | 0x43 << 22)
#define GEN9_XY_FAST_COPY_BLT_CMD (2 << 29 | 0x42 << 22)
#define XY_SRC_COPY_BLT_CMD (2 << 29 | 0x53 << 22)
......@@ -288,8 +310,11 @@
#define PIPE_CONTROL_DEPTH_CACHE_FLUSH (1<<0)
#define PIPE_CONTROL_GLOBAL_GTT (1<<2) /* in addr dword */
/* 3D-related flags can't be set on compute engine */
#define PIPE_CONTROL_3D_FLAGS (\
/*
* 3D-related flags that can't be set on _engines_ that lack access to the 3D
* pipeline (i.e., CCS engines).
*/
#define PIPE_CONTROL_3D_ENGINE_FLAGS (\
PIPE_CONTROL_RENDER_TARGET_CACHE_FLUSH | \
PIPE_CONTROL_DEPTH_CACHE_FLUSH | \
PIPE_CONTROL_TILE_CACHE_FLUSH | \
......@@ -300,6 +325,14 @@
PIPE_CONTROL_VF_CACHE_INVALIDATE | \
PIPE_CONTROL_GLOBAL_SNAPSHOT_RESET)
/* 3D-related flags that can't be set on _platforms_ that lack a 3D pipeline */
#define PIPE_CONTROL_3D_ARCH_FLAGS ( \
PIPE_CONTROL_3D_ENGINE_FLAGS | \
PIPE_CONTROL_INDIRECT_STATE_DISABLE | \
PIPE_CONTROL_FLUSH_ENABLE | \
PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE | \
PIPE_CONTROL_DC_FLUSH_ENABLE)
#define MI_MATH(x) MI_INSTR(0x1a, (x) - 1)
#define MI_MATH_INSTR(opcode, op1, op2) ((opcode) << 20 | (op1) << 10 | (op2))
/* Opcodes for MI_MATH_INSTR */
......
......@@ -4,6 +4,7 @@
*/
#include <drm/drm_managed.h>
#include <drm/intel-gtt.h>
#include "gem/i915_gem_internal.h"
#include "gem/i915_gem_lmem.h"
......@@ -12,11 +13,12 @@
#include "i915_drv.h"
#include "intel_context.h"
#include "intel_engine_regs.h"
#include "intel_ggtt_gmch.h"
#include "intel_gt.h"
#include "intel_gt_buffer_pool.h"
#include "intel_gt_clock_utils.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_gmch.h"
#include "intel_gt_mcr.h"
#include "intel_gt_pm.h"
#include "intel_gt_regs.h"
#include "intel_gt_requests.h"
......@@ -102,78 +104,13 @@ int intel_gt_assign_ggtt(struct intel_gt *gt)
return gt->ggtt ? 0 : -ENOMEM;
}
static const char * const intel_steering_types[] = {
"L3BANK",
"MSLICE",
"LNCF",
};
static const struct intel_mmio_range icl_l3bank_steering_table[] = {
{ 0x00B100, 0x00B3FF },
{},
};
static const struct intel_mmio_range xehpsdv_mslice_steering_table[] = {
{ 0x004000, 0x004AFF },
{ 0x00C800, 0x00CFFF },
{ 0x00DD00, 0x00DDFF },
{ 0x00E900, 0x00FFFF }, /* 0xEA00 - 0xEFFF is unused */
{},
};
static const struct intel_mmio_range xehpsdv_lncf_steering_table[] = {
{ 0x00B000, 0x00B0FF },
{ 0x00D800, 0x00D8FF },
{},
};
static const struct intel_mmio_range dg2_lncf_steering_table[] = {
{ 0x00B000, 0x00B0FF },
{ 0x00D880, 0x00D8FF },
{},
};
static u16 slicemask(struct intel_gt *gt, int count)
{
u64 dss_mask = intel_sseu_get_subslices(&gt->info.sseu, 0);
return intel_slicemask_from_dssmask(dss_mask, count);
}
int intel_gt_init_mmio(struct intel_gt *gt)
{
struct drm_i915_private *i915 = gt->i915;
intel_gt_init_clock_frequency(gt);
intel_uc_init_mmio(&gt->uc);
intel_sseu_info_init(gt);
/*
* An mslice is unavailable only if both the meml3 for the slice is
* disabled *and* all of the DSS in the slice (quadrant) are disabled.
*/
if (HAS_MSLICES(i915))
gt->info.mslice_mask =
slicemask(gt, GEN_DSS_PER_MSLICE) |
(intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
GEN12_MEML3_EN_MASK);
if (IS_DG2(i915)) {
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
gt->steering_table[LNCF] = dg2_lncf_steering_table;
} else if (IS_XEHPSDV(i915)) {
gt->steering_table[MSLICE] = xehpsdv_mslice_steering_table;
gt->steering_table[LNCF] = xehpsdv_lncf_steering_table;
} else if (GRAPHICS_VER(i915) >= 11 &&
GRAPHICS_VER_FULL(i915) < IP_VER(12, 50)) {
gt->steering_table[L3BANK] = icl_l3bank_steering_table;
gt->info.l3bank_mask =
~intel_uncore_read(gt->uncore, GEN10_MIRROR_FUSE3) &
GEN10_L3BANK_MASK;
} else if (HAS_MSLICES(i915)) {
MISSING_CASE(INTEL_INFO(i915)->platform);
}
intel_gt_mcr_init(gt);
return intel_engines_init_mmio(gt);
}
......@@ -451,7 +388,7 @@ void intel_gt_chipset_flush(struct intel_gt *gt)
{
wmb();
if (GRAPHICS_VER(gt->i915) < 6)
intel_gt_gmch_gen5_chipset_flush(gt);
intel_ggtt_gmch_flush();
}
void intel_gt_driver_register(struct intel_gt *gt)
......@@ -785,6 +722,7 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
{
intel_wakeref_t wakeref;
intel_gt_sysfs_unregister(gt);
intel_rps_driver_unregister(&gt->rps);
intel_gsc_fini(&gt->gsc);
......@@ -834,200 +772,6 @@ void intel_gt_driver_late_release_all(struct drm_i915_private *i915)
}
}
/**
* intel_gt_reg_needs_read_steering - determine whether a register read
* requires explicit steering
* @gt: GT structure
* @reg: the register to check steering requirements for
* @type: type of multicast steering to check
*
* Determines whether @reg needs explicit steering of a specific type for
* reads.
*
* Returns false if @reg does not belong to a register range of the given
* steering type, or if the default (subslice-based) steering IDs are suitable
* for @type steering too.
*/
static bool intel_gt_reg_needs_read_steering(struct intel_gt *gt,
i915_reg_t reg,
enum intel_steering_type type)
{
const u32 offset = i915_mmio_reg_offset(reg);
const struct intel_mmio_range *entry;
if (likely(!intel_gt_needs_read_steering(gt, type)))
return false;
for (entry = gt->steering_table[type]; entry->end; entry++) {
if (offset >= entry->start && offset <= entry->end)
return true;
}
return false;
}
/**
* intel_gt_get_valid_steering - determines valid IDs for a class of MCR steering
* @gt: GT structure
* @type: multicast register type
* @sliceid: Slice ID returned
* @subsliceid: Subslice ID returned
*
* Determines sliceid and subsliceid values that will steer reads
* of a specific multicast register class to a valid value.
*/
static void intel_gt_get_valid_steering(struct intel_gt *gt,
enum intel_steering_type type,
u8 *sliceid, u8 *subsliceid)
{
switch (type) {
case L3BANK:
GEM_DEBUG_WARN_ON(!gt->info.l3bank_mask); /* should be impossible! */
*sliceid = 0; /* unused */
*subsliceid = __ffs(gt->info.l3bank_mask);
break;
case MSLICE:
GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
*sliceid = __ffs(gt->info.mslice_mask);
*subsliceid = 0; /* unused */
break;
case LNCF:
GEM_DEBUG_WARN_ON(!gt->info.mslice_mask); /* should be impossible! */
/*
* An LNCF is always present if its mslice is present, so we
* can safely just steer to LNCF 0 in all cases.
*/
*sliceid = __ffs(gt->info.mslice_mask) << 1;
*subsliceid = 0; /* unused */
break;
default:
MISSING_CASE(type);
*sliceid = 0;
*subsliceid = 0;
}
}
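/*
 * Worked example (illustrative, not from the original source): with
 * mslice_mask == 0b0100, __ffs() picks mslice 2; each mslice fronts two
 * LNCFs, so steering to LNCF 0 of that mslice uses sliceid == 2 << 1 == 4.
 */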
/**
* intel_gt_read_register_fw - reads a GT register with support for multicast
* @gt: GT structure
* @reg: register to read
*
* This function will read a GT register. If the register is a multicast
* register, the read will be steered to a valid instance (i.e., one that
* isn't fused off or powered down by power gating).
*
* Returns the value from a valid instance of @reg.
*/
u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg)
{
int type;
u8 sliceid, subsliceid;
for (type = 0; type < NUM_STEERING_TYPES; type++) {
if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
intel_gt_get_valid_steering(gt, type, &sliceid,
&subsliceid);
return intel_uncore_read_with_mcr_steering_fw(gt->uncore,
reg,
sliceid,
subsliceid);
}
}
return intel_uncore_read_fw(gt->uncore, reg);
}
/**
* intel_gt_get_valid_steering_for_reg - get a valid steering for a register
* @gt: GT structure
* @reg: register for which the steering is required
* @sliceid: return variable for slice steering
* @subsliceid: return variable for subslice steering
*
* This function returns a slice/subslice pair that is guaranteed to work for
* read steering of the given register. Note that a value will be returned even
* if the register is not replicated and therefore does not actually require
* steering.
*/
void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
u8 *sliceid, u8 *subsliceid)
{
int type;
for (type = 0; type < NUM_STEERING_TYPES; type++) {
if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
intel_gt_get_valid_steering(gt, type, sliceid,
subsliceid);
return;
}
}
*sliceid = gt->default_steering.groupid;
*subsliceid = gt->default_steering.instanceid;
}
u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg)
{
int type;
u8 sliceid, subsliceid;
for (type = 0; type < NUM_STEERING_TYPES; type++) {
if (intel_gt_reg_needs_read_steering(gt, reg, type)) {
intel_gt_get_valid_steering(gt, type, &sliceid,
&subsliceid);
return intel_uncore_read_with_mcr_steering(gt->uncore,
reg,
sliceid,
subsliceid);
}
}
return intel_uncore_read(gt->uncore, reg);
}
static void report_steering_type(struct drm_printer *p,
struct intel_gt *gt,
enum intel_steering_type type,
bool dump_table)
{
const struct intel_mmio_range *entry;
u8 slice, subslice;
BUILD_BUG_ON(ARRAY_SIZE(intel_steering_types) != NUM_STEERING_TYPES);
if (!gt->steering_table[type]) {
drm_printf(p, "%s steering: uses default steering\n",
intel_steering_types[type]);
return;
}
intel_gt_get_valid_steering(gt, type, &slice, &subslice);
drm_printf(p, "%s steering: sliceid=0x%x, subsliceid=0x%x\n",
intel_steering_types[type], slice, subslice);
if (!dump_table)
return;
for (entry = gt->steering_table[type]; entry->end; entry++)
drm_printf(p, "\t0x%06x - 0x%06x\n", entry->start, entry->end);
}
void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
bool dump_table)
{
drm_printf(p, "Default steering: sliceid=0x%x, subsliceid=0x%x\n",
gt->default_steering.groupid,
gt->default_steering.instanceid);
if (HAS_MSLICES(gt->i915)) {
report_steering_type(p, gt, MSLICE, dump_table);
report_steering_type(p, gt, LNCF, dump_table);
}
}
static int intel_gt_tile_setup(struct intel_gt *gt, phys_addr_t phys_addr)
{
int ret;
......
......@@ -13,13 +13,6 @@
struct drm_i915_private;
struct drm_printer;
struct insert_entries {
struct i915_address_space *vm;
struct i915_vma_resource *vma_res;
enum i915_cache_level level;
u32 flags;
};
#define GT_TRACE(gt, fmt, ...) do { \
const struct intel_gt *gt__ __maybe_unused = (gt); \
GEM_TRACE("%s " fmt, dev_name(gt__->i915->drm.dev), \
......@@ -93,21 +86,6 @@ static inline bool intel_gt_is_wedged(const struct intel_gt *gt)
return unlikely(test_bit(I915_WEDGED, &gt->reset.flags));
}
static inline bool intel_gt_needs_read_steering(struct intel_gt *gt,
enum intel_steering_type type)
{
return gt->steering_table[type];
}
void intel_gt_get_valid_steering_for_reg(struct intel_gt *gt, i915_reg_t reg,
u8 *sliceid, u8 *subsliceid);
u32 intel_gt_read_register_fw(struct intel_gt *gt, i915_reg_t reg);
u32 intel_gt_read_register(struct intel_gt *gt, i915_reg_t reg);
void intel_gt_report_steering(struct drm_printer *p, struct intel_gt *gt,
bool dump_table);
int intel_gt_probe_all(struct drm_i915_private *i915);
int intel_gt_tiles_init(struct drm_i915_private *i915);
void intel_gt_release_all(struct drm_i915_private *i915);
......@@ -125,6 +103,4 @@ void intel_gt_watchdog_work(struct work_struct *work);
void intel_gt_invalidate_tlbs(struct intel_gt *gt);
struct resource intel_pci_resource(struct pci_dev *pdev, int bar);
#endif /* __INTEL_GT_H__ */
......@@ -9,6 +9,7 @@
#include "intel_gt.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_engines_debugfs.h"
#include "intel_gt_mcr.h"
#include "intel_gt_pm_debugfs.h"
#include "intel_sseu_debugfs.h"
#include "pxp/intel_pxp_debugfs.h"
......@@ -64,7 +65,7 @@ static int steering_show(struct seq_file *m, void *data)
struct drm_printer p = drm_seq_file_printer(m);
struct intel_gt *gt = m->private;
intel_gt_report_steering(&p, gt, true);
intel_gt_mcr_report_steering(&p, gt, true);
return 0;
}
......
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef __INTEL_GT_GMCH_H__
#define __INTEL_GT_GMCH_H__
#include "intel_gtt.h"
/* For x86 platforms */
#if IS_ENABLED(CONFIG_X86)
void intel_gt_gmch_gen5_chipset_flush(struct intel_gt *gt);
int intel_gt_gmch_gen6_probe(struct i915_ggtt *ggtt);
int intel_gt_gmch_gen8_probe(struct i915_ggtt *ggtt);
int intel_gt_gmch_gen5_probe(struct i915_ggtt *ggtt);
int intel_gt_gmch_gen5_enable_hw(struct drm_i915_private *i915);
/* Stubs for non-x86 platforms */
#else
static inline void intel_gt_gmch_gen5_chipset_flush(struct intel_gt *gt)
{
}
static inline int intel_gt_gmch_gen5_probe(struct i915_ggtt *ggtt)
{
/* No HW should be probed for this case yet, return fail */
return -ENODEV;
}
static inline int intel_gt_gmch_gen6_probe(struct i915_ggtt *ggtt)
{
/* No HW should be probed for this case yet, return fail */
return -ENODEV;
}
static inline int intel_gt_gmch_gen8_probe(struct i915_ggtt *ggtt)
{
/* No HW should be probed for this case yet, return fail */
return -ENODEV;
}
static inline int intel_gt_gmch_gen5_enable_hw(struct drm_i915_private *i915)
{
/* No HW should be enabled for this case yet, return fail */
return -ENODEV;
}
#endif
#endif /* __INTEL_GT_GMCH_H__ */
......@@ -193,6 +193,14 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
/* Restore irq masks on RCS, BCS, VCS and VECS engines. */
intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~0);
if (HAS_ENGINE(gt, BCS1) || HAS_ENGINE(gt, BCS2))
intel_uncore_write(uncore, XEHPC_BCS1_BCS2_INTR_MASK, ~0);
if (HAS_ENGINE(gt, BCS3) || HAS_ENGINE(gt, BCS4))
intel_uncore_write(uncore, XEHPC_BCS3_BCS4_INTR_MASK, ~0);
if (HAS_ENGINE(gt, BCS5) || HAS_ENGINE(gt, BCS6))
intel_uncore_write(uncore, XEHPC_BCS5_BCS6_INTR_MASK, ~0);
if (HAS_ENGINE(gt, BCS7) || HAS_ENGINE(gt, BCS8))
intel_uncore_write(uncore, XEHPC_BCS7_BCS8_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~0);
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
......@@ -248,6 +256,14 @@ void gen11_gt_irq_postinstall(struct intel_gt *gt)
/* Unmask irqs on RCS, BCS, VCS and VECS engines. */
intel_uncore_write(uncore, GEN11_RCS0_RSVD_INTR_MASK, ~smask);
intel_uncore_write(uncore, GEN11_BCS_RSVD_INTR_MASK, ~smask);
if (HAS_ENGINE(gt, BCS1) || HAS_ENGINE(gt, BCS2))
intel_uncore_write(uncore, XEHPC_BCS1_BCS2_INTR_MASK, ~dmask);
if (HAS_ENGINE(gt, BCS3) || HAS_ENGINE(gt, BCS4))
intel_uncore_write(uncore, XEHPC_BCS3_BCS4_INTR_MASK, ~dmask);
if (HAS_ENGINE(gt, BCS5) || HAS_ENGINE(gt, BCS6))
intel_uncore_write(uncore, XEHPC_BCS5_BCS6_INTR_MASK, ~dmask);
if (HAS_ENGINE(gt, BCS7) || HAS_ENGINE(gt, BCS8))
intel_uncore_write(uncore, XEHPC_BCS7_BCS8_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VCS0_VCS1_INTR_MASK, ~dmask);
intel_uncore_write(uncore, GEN11_VCS2_VCS3_INTR_MASK, ~dmask);
if (HAS_ENGINE(gt, VCS4) || HAS_ENGINE(gt, VCS5))
......
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2022 Intel Corporation
*/
#ifndef __INTEL_GT_MCR__
#define __INTEL_GT_MCR__
#include "intel_gt_types.h"
void intel_gt_mcr_init(struct intel_gt *gt);
u32 intel_gt_mcr_read(struct intel_gt *gt,
i915_reg_t reg,
int group, int instance);
u32 intel_gt_mcr_read_any_fw(struct intel_gt *gt, i915_reg_t reg);
u32 intel_gt_mcr_read_any(struct intel_gt *gt, i915_reg_t reg);
void intel_gt_mcr_unicast_write(struct intel_gt *gt,
i915_reg_t reg, u32 value,
int group, int instance);
void intel_gt_mcr_multicast_write(struct intel_gt *gt,
i915_reg_t reg, u32 value);
void intel_gt_mcr_multicast_write_fw(struct intel_gt *gt,
i915_reg_t reg, u32 value);
void intel_gt_mcr_get_nonterminated_steering(struct intel_gt *gt,
i915_reg_t reg,
u8 *group, u8 *instance);
void intel_gt_mcr_report_steering(struct drm_printer *p, struct intel_gt *gt,
bool dump_table);
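/*
 * Hedged usage sketch (an assumption for illustration, not part of this
 * patch): a steered read of a multicast register followed by a broadcast
 * write could look like
 *
 * u32 val = intel_gt_mcr_read(gt, SOME_MCR_REG, group, instance);
 * intel_gt_mcr_multicast_write(gt, SOME_MCR_REG, val);
 *
 * where SOME_MCR_REG, group and instance are placeholders.
 */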
#endif /* __INTEL_GT_MCR__ */
......@@ -100,14 +100,16 @@ static int vlv_drpc(struct seq_file *m)
{
struct intel_gt *gt = m->private;
struct intel_uncore *uncore = gt->uncore;
u32 rcctl1, pw_status;
u32 rcctl1, pw_status, mt_fwake_req;
mt_fwake_req = intel_uncore_read_fw(uncore, FORCEWAKE_MT);
pw_status = intel_uncore_read(uncore, VLV_GTLC_PW_STATUS);
rcctl1 = intel_uncore_read(uncore, GEN6_RC_CONTROL);
seq_printf(m, "RC6 Enabled: %s\n",
str_yes_no(rcctl1 & (GEN7_RC_CTL_TO_MODE |
GEN6_RC_CTL_EI_MODE(1))));
seq_printf(m, "Multi-threaded Forcewake Request: 0x%x\n", mt_fwake_req);
seq_printf(m, "Render Power Well: %s\n",
(pw_status & VLV_GTLC_PW_RENDER_STATUS_MASK) ? "Up" : "Down");
seq_printf(m, "Media Power Well: %s\n",
......@@ -124,9 +126,10 @@ static int gen6_drpc(struct seq_file *m)
struct intel_gt *gt = m->private;
struct drm_i915_private *i915 = gt->i915;
struct intel_uncore *uncore = gt->uncore;
u32 gt_core_status, rcctl1, rc6vids = 0;
u32 gt_core_status, mt_fwake_req, rcctl1, rc6vids = 0;
u32 gen9_powergate_enable = 0, gen9_powergate_status = 0;
mt_fwake_req = intel_uncore_read_fw(uncore, FORCEWAKE_MT);
gt_core_status = intel_uncore_read_fw(uncore, GEN6_GT_CORE_STATUS);
rcctl1 = intel_uncore_read(uncore, GEN6_RC_CONTROL);
......@@ -178,6 +181,7 @@ static int gen6_drpc(struct seq_file *m)
seq_printf(m, "Core Power Down: %s\n",
str_yes_no(gt_core_status & GEN6_CORE_CPD_STATE_MASK));
seq_printf(m, "Multi-threaded Forcewake Request: 0x%x\n", mt_fwake_req);
if (GRAPHICS_VER(i915) >= 9) {
seq_printf(m, "Render Power Well: %s\n",
(gen9_powergate_status &
......
......@@ -140,6 +140,7 @@
#define FF_SLICE_CS_CHICKEN2 _MMIO(0x20e4)
#define GEN9_TSG_BARRIER_ACK_DISABLE (1 << 8)
#define GEN9_POOLED_EU_LOAD_BALANCING_FIX_DISABLE (1 << 10)
#define GEN12_PERF_FIX_BALANCING_CFE_DISABLE REG_BIT(15)
#define GEN9_CS_DEBUG_MODE1 _MMIO(0x20ec)
#define FF_DOP_CLOCK_GATE_DISABLE REG_BIT(1)
......@@ -323,8 +324,11 @@
#define GEN12_PAT_INDEX(index) _MMIO(0x4800 + (index) * 4)
#define XEHPSDV_FLAT_CCS_BASE_ADDR _MMIO(0x4910)
#define XEHPSDV_CCS_BASE_SHIFT 8
#define XEHP_TILE0_ADDR_RANGE _MMIO(0x4900)
#define XEHP_TILE_LMEM_RANGE_SHIFT 8
#define XEHP_FLAT_CCS_BASE_ADDR _MMIO(0x4910)
#define XEHP_CCS_BASE_SHIFT 8
#define GAMTARBMODE _MMIO(0x4a08)
#define ARB_MODE_BWGTLB_DISABLE (1 << 9)
......@@ -561,6 +565,7 @@
#define GEN11_GT_VEBOX_DISABLE_MASK (0x0f << GEN11_GT_VEBOX_DISABLE_SHIFT)
#define GEN12_GT_COMPUTE_DSS_ENABLE _MMIO(0x9144)
#define XEHPC_GT_COMPUTE_DSS_ENABLE_EXT _MMIO(0x9148)
#define GEN6_UCGCTL1 _MMIO(0x9400)
#define GEN6_GAMUNIT_CLOCK_GATE_DISABLE (1 << 22)
......@@ -597,24 +602,32 @@
/* GEN11 changed all bit defs except for FULL & RENDER */
#define GEN11_GRDOM_FULL GEN6_GRDOM_FULL
#define GEN11_GRDOM_RENDER GEN6_GRDOM_RENDER
#define GEN11_GRDOM_BLT (1 << 2)
#define GEN11_GRDOM_GUC (1 << 3)
#define GEN11_GRDOM_MEDIA (1 << 5)
#define GEN11_GRDOM_MEDIA2 (1 << 6)
#define GEN11_GRDOM_MEDIA3 (1 << 7)
#define GEN11_GRDOM_MEDIA4 (1 << 8)
#define GEN11_GRDOM_MEDIA5 (1 << 9)
#define GEN11_GRDOM_MEDIA6 (1 << 10)
#define GEN11_GRDOM_MEDIA7 (1 << 11)
#define GEN11_GRDOM_MEDIA8 (1 << 12)
#define GEN11_GRDOM_VECS (1 << 13)
#define GEN11_GRDOM_VECS2 (1 << 14)
#define GEN11_GRDOM_VECS3 (1 << 15)
#define GEN11_GRDOM_VECS4 (1 << 16)
#define GEN11_GRDOM_SFC0 (1 << 17)
#define GEN11_GRDOM_SFC1 (1 << 18)
#define GEN11_GRDOM_SFC2 (1 << 19)
#define GEN11_GRDOM_SFC3 (1 << 20)
#define XEHPC_GRDOM_BLT8 REG_BIT(31)
#define XEHPC_GRDOM_BLT7 REG_BIT(30)
#define XEHPC_GRDOM_BLT6 REG_BIT(29)
#define XEHPC_GRDOM_BLT5 REG_BIT(28)
#define XEHPC_GRDOM_BLT4 REG_BIT(27)
#define XEHPC_GRDOM_BLT3 REG_BIT(26)
#define XEHPC_GRDOM_BLT2 REG_BIT(25)
#define XEHPC_GRDOM_BLT1 REG_BIT(24)
#define GEN11_GRDOM_SFC3 REG_BIT(20)
#define GEN11_GRDOM_SFC2 REG_BIT(19)
#define GEN11_GRDOM_SFC1 REG_BIT(18)
#define GEN11_GRDOM_SFC0 REG_BIT(17)
#define GEN11_GRDOM_VECS4 REG_BIT(16)
#define GEN11_GRDOM_VECS3 REG_BIT(15)
#define GEN11_GRDOM_VECS2 REG_BIT(14)
#define GEN11_GRDOM_VECS REG_BIT(13)
#define GEN11_GRDOM_MEDIA8 REG_BIT(12)
#define GEN11_GRDOM_MEDIA7 REG_BIT(11)
#define GEN11_GRDOM_MEDIA6 REG_BIT(10)
#define GEN11_GRDOM_MEDIA5 REG_BIT(9)
#define GEN11_GRDOM_MEDIA4 REG_BIT(8)
#define GEN11_GRDOM_MEDIA3 REG_BIT(7)
#define GEN11_GRDOM_MEDIA2 REG_BIT(6)
#define GEN11_GRDOM_MEDIA REG_BIT(5)
#define GEN11_GRDOM_GUC REG_BIT(3)
#define GEN11_GRDOM_BLT REG_BIT(2)
#define GEN11_VCS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << ((instance) >> 1))
#define GEN11_VECS_SFC_RESET_BIT(instance) (GEN11_GRDOM_SFC0 << (instance))
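/*
 * Worked example (illustrative): VCS instance 3 shares an SFC with VCS2,
 * so GEN11_VCS_SFC_RESET_BIT(3) == GEN11_GRDOM_SFC0 << (3 >> 1), i.e.
 * GEN11_GRDOM_SFC1 (bit 18); VECS instances map 1:1 to SFCs instead.
 */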
......@@ -622,6 +635,7 @@
#define GEN7_MISCCPCTL _MMIO(0x9424)
#define GEN7_DOP_CLOCK_GATE_ENABLE (1 << 0)
#define GEN12_DOP_CLOCK_GATE_RENDER_ENABLE REG_BIT(1)
#define GEN8_DOP_CLOCK_GATE_CFCLK_ENABLE (1 << 2)
#define GEN8_DOP_CLOCK_GATE_GUC_ENABLE (1 << 4)
#define GEN8_DOP_CLOCK_GATE_MEDIA_ENABLE (1 << 6)
......@@ -732,6 +746,7 @@
#define GEN6_AGGRESSIVE_TURBO (0 << 15)
#define GEN9_SW_REQ_UNSLICE_RATIO_SHIFT 23
#define GEN9_IGNORE_SLICE_RATIO (0 << 0)
#define GEN12_MEDIA_FREQ_RATIO REG_BIT(13)
#define GEN6_RC_VIDEO_FREQ _MMIO(0xa00c)
#define GEN6_RC_CTL_RC6pp_ENABLE (1 << 16)
......@@ -969,6 +984,11 @@
#define XEHP_L3SCQREG7 _MMIO(0xb188)
#define BLEND_FILL_CACHING_OPT_DIS REG_BIT(3)
#define XEHPC_L3SCRUB _MMIO(0xb18c)
#define SCRUB_CL_DWNGRADE_SHARED REG_BIT(12)
#define SCRUB_RATE_PER_BANK_MASK REG_GENMASK(2, 0)
#define SCRUB_RATE_4B_PER_CLK REG_FIELD_PREP(SCRUB_RATE_PER_BANK_MASK, 0x6)
#define L3SQCREG1_CCS0 _MMIO(0xb200)
#define FLUSHALLNONCOH REG_BIT(5)
......@@ -1060,8 +1080,10 @@
#define GEN9_ENABLE_GPGPU_PREEMPTION REG_BIT(2)
#define GEN10_CACHE_MODE_SS _MMIO(0xe420)
#define ENABLE_PREFETCH_INTO_IC REG_BIT(3)
#define ENABLE_EU_COUNT_FOR_TDL_FLUSH REG_BIT(10)
#define DISABLE_ECC REG_BIT(5)
#define FLOAT_BLEND_OPTIMIZATION_ENABLE REG_BIT(4)
#define ENABLE_PREFETCH_INTO_IC REG_BIT(3)
#define EU_PERF_CNTL0 _MMIO(0xe458)
#define EU_PERF_CNTL4 _MMIO(0xe45c)
......@@ -1476,6 +1498,14 @@
#define GEN11_KCR (19)
#define GEN11_GTPM (16)
#define GEN11_BCS (15)
#define XEHPC_BCS1 (14)
#define XEHPC_BCS2 (13)
#define XEHPC_BCS3 (12)
#define XEHPC_BCS4 (11)
#define XEHPC_BCS5 (10)
#define XEHPC_BCS6 (9)
#define XEHPC_BCS7 (8)
#define XEHPC_BCS8 (23)
#define GEN12_CCS3 (7)
#define GEN12_CCS2 (6)
#define GEN12_CCS1 (5)
......@@ -1521,6 +1551,10 @@
#define GEN11_GUNIT_CSME_INTR_MASK _MMIO(0x1900f4)
#define GEN12_CCS0_CCS1_INTR_MASK _MMIO(0x190100)
#define GEN12_CCS2_CCS3_INTR_MASK _MMIO(0x190104)
#define XEHPC_BCS1_BCS2_INTR_MASK _MMIO(0x190110)
#define XEHPC_BCS3_BCS4_INTR_MASK _MMIO(0x190114)
#define XEHPC_BCS5_BCS6_INTR_MASK _MMIO(0x190118)
#define XEHPC_BCS7_BCS8_INTR_MASK _MMIO(0x19011c)
#define GEN12_SFC_DONE(n) _MMIO(0x1cc000 + (n) * 0x1000)
......
......@@ -24,7 +24,7 @@ bool is_object_gt(struct kobject *kobj)
static struct intel_gt *kobj_to_gt(struct kobject *kobj)
{
return container_of(kobj, struct kobj_gt, base)->gt;
return container_of(kobj, struct intel_gt, sysfs_gt);
}
struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
......@@ -72,9 +72,9 @@ static struct attribute *id_attrs[] = {
};
ATTRIBUTE_GROUPS(id);
/* A kobject needs a release() method even if it does nothing */
static void kobj_gt_release(struct kobject *kobj)
{
kfree(kobj);
}
static struct kobj_type kobj_gt_type = {
......@@ -85,8 +85,6 @@ static struct kobj_type kobj_gt_type = {
void intel_gt_sysfs_register(struct intel_gt *gt)
{
struct kobj_gt *kg;
/*
* We need to make things right with the
* ABI compatibility. The files were originally
......@@ -98,25 +96,22 @@ void intel_gt_sysfs_register(struct intel_gt *gt)
if (gt_is_root(gt))
intel_gt_sysfs_pm_init(gt, gt_get_parent_obj(gt));
kg = kzalloc(sizeof(*kg), GFP_KERNEL);
if (!kg)
/* init and xfer ownership to sysfs tree */
if (kobject_init_and_add(&gt->sysfs_gt, &kobj_gt_type,
gt->i915->sysfs_gt, "gt%d", gt->info.id))
goto exit_fail;
kobject_init(&kg->base, &kobj_gt_type);
kg->gt = gt;
/* xfer ownership to sysfs tree */
if (kobject_add(&kg->base, gt->i915->sysfs_gt, "gt%d", gt->info.id))
goto exit_kobj_put;
intel_gt_sysfs_pm_init(gt, &kg->base);
intel_gt_sysfs_pm_init(gt, &gt->sysfs_gt);
return;
exit_kobj_put:
kobject_put(&kg->base);
exit_fail:
kobject_put(&gt->sysfs_gt);
drm_warn(&gt->i915->drm,
"failed to initialize gt%d sysfs root\n", gt->info.id);
}
void intel_gt_sysfs_unregister(struct intel_gt *gt)
{
kobject_put(&gt->sysfs_gt);
}
......@@ -13,11 +13,6 @@
struct intel_gt;
struct kobj_gt {
struct kobject base;
struct intel_gt *gt;
};
bool is_object_gt(struct kobject *kobj);
struct drm_i915_private *kobj_to_i915(struct kobject *kobj);
......@@ -28,6 +23,7 @@ intel_gt_create_kobj(struct intel_gt *gt,
const char *name);
void intel_gt_sysfs_register(struct intel_gt *gt);
void intel_gt_sysfs_unregister(struct intel_gt *gt);
struct intel_gt *intel_gt_sysfs_get_drvdata(struct device *dev,
const char *name);
......
......@@ -14,6 +14,7 @@
#include "intel_gt_regs.h"
#include "intel_gt_sysfs.h"
#include "intel_gt_sysfs_pm.h"
#include "intel_pcode.h"
#include "intel_rc6.h"
#include "intel_rps.h"
......@@ -558,6 +559,174 @@ static const struct attribute *freq_attrs[] = {
NULL
};
/*
* Scaling for multipliers (aka frequency factors).
* The format of the value in the register is u8.8.
*
* The presentation to userspace is inspired by the perf event framework.
* See:
* Documentation/ABI/testing/sysfs-bus-event_source-devices-events
* for description of:
* /sys/bus/event_source/devices/<pmu>/events/<event>.scale
*
* Summary: Expose two sysfs files for each multiplier.
*
* 1. File <attr> contains a raw hardware value.
* 2. File <attr>.scale contains the multiplicative scale factor to be
* used by userspace to compute the actual value.
*
* So userspace knows that to get the frequency_factor it multiplies the
* provided value by the specified scale factor, and vice versa.
*
* That way there is no precision loss in the kernel interface, and the
* API is future-proof should the hardware register one day change to
* u16.16 (or any other fixed-point representation) on some platform.
*
* Example:
* File <attr> contains the value 2.5, represented as u8.8 0x0280, which
* is comprised of:
* - an integer part of 2
* - a fractional part of 0x80 (representing 0x80 / 2^8 == 0x80 / 256).
* File <attr>.scale contains a string representation of floating point
* value 0.00390625 (which is (1 / 256)).
* Userspace computes the actual value:
* 0x0280 * 0.00390625 -> 2.5
* or converts an actual value to the value to be written into <attr>:
* 2.5 / 0.00390625 -> 0x0280
*/
#define U8_8_VAL_MASK 0xffff
#define U8_8_SCALE_TO_VALUE "0.00390625"
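/*
 * Illustrative sketch only (not part of this patch): one way userspace,
 * or a test, could convert the raw u8.8 value to thousandths without
 * floating point. u8_8_to_milli() is a hypothetical name for this
 * example.
 */
static inline u32 u8_8_to_milli(u32 raw)
{
/* e.g. raw 0x0280 -> 2500, i.e. 2.5 once divided by 1000 */
return (raw & U8_8_VAL_MASK) * 1000 / 256;
}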
static ssize_t freq_factor_scale_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
return sysfs_emit(buff, "%s\n", U8_8_SCALE_TO_VALUE);
}
static u32 media_ratio_mode_to_factor(u32 mode)
{
/* 0 -> 0, 1 -> 256, 2 -> 128 */
return !mode ? mode : 256 / mode;
}
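/*
 * Worked example (illustrative): SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO
 * (mode 2) yields 256 / 2 == 128, the u8.8 encoding of 0.5, matching the
 * 1:2 media ratio; mode 1 yields 256 (1.0) and mode 0 is passed through.
 */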
static ssize_t media_freq_factor_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
intel_wakeref_t wakeref;
u32 mode;
/*
* Retrieve media_ratio_mode from GEN6_RPNSWREQ bit 13 set by
* GuC. GEN6_RPNSWREQ:13 value 0 represents 1:2 and 1 represents 1:1.
*/
if (IS_XEHPSDV(gt->i915) &&
slpc->media_ratio_mode == SLPC_MEDIA_RATIO_MODE_DYNAMIC_CONTROL) {
/*
* For XEHPSDV dynamic mode GEN6_RPNSWREQ:13 does not contain
* the media_ratio_mode, just return the cached media ratio
*/
mode = slpc->media_ratio_mode;
} else {
with_intel_runtime_pm(gt->uncore->rpm, wakeref)
mode = intel_uncore_read(gt->uncore, GEN6_RPNSWREQ);
mode = REG_FIELD_GET(GEN12_MEDIA_FREQ_RATIO, mode) ?
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_ONE :
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO;
}
return sysfs_emit(buff, "%u\n", media_ratio_mode_to_factor(mode));
}
static ssize_t media_freq_factor_store(struct device *dev,
struct device_attribute *attr,
const char *buff, size_t count)
{
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
struct intel_guc_slpc *slpc = &gt->uc.guc.slpc;
u32 factor, mode;
int err;
err = kstrtou32(buff, 0, &factor);
if (err)
return err;
for (mode = SLPC_MEDIA_RATIO_MODE_DYNAMIC_CONTROL;
mode <= SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO; mode++)
if (factor == media_ratio_mode_to_factor(mode))
break;
if (mode > SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO)
return -EINVAL;
err = intel_guc_slpc_set_media_ratio_mode(slpc, mode);
if (!err) {
slpc->media_ratio_mode = mode;
DRM_DEBUG("Set slpc->media_ratio_mode to %d\n", mode);
}
return err ?: count;
}
static ssize_t media_RP0_freq_mhz_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
u32 val;
int err;
err = snb_pcode_read_p(gt->uncore, XEHP_PCODE_FREQUENCY_CONFIG,
PCODE_MBOX_FC_SC_READ_FUSED_P0,
PCODE_MBOX_DOMAIN_MEDIAFF, &val);
if (err)
return err;
/* Fused media RP0 read from pcode is in units of 50 MHz */
val *= GT_FREQUENCY_MULTIPLIER;
return sysfs_emit(buff, "%u\n", val);
}
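/*
 * Worked example (illustrative, assuming GT_FREQUENCY_MULTIPLIER == 50):
 * a fused value of 30 reported by pcode reads back as 30 * 50 == 1500 MHz.
 */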
static ssize_t media_RPn_freq_mhz_show(struct device *dev,
struct device_attribute *attr,
char *buff)
{
struct intel_gt *gt = intel_gt_sysfs_get_drvdata(dev, attr->attr.name);
u32 val;
int err;
err = snb_pcode_read_p(gt->uncore, XEHP_PCODE_FREQUENCY_CONFIG,
PCODE_MBOX_FC_SC_READ_FUSED_PN,
PCODE_MBOX_DOMAIN_MEDIAFF, &val);
if (err)
return err;
/* Fused media RPn read from pcode is in units of 50 MHz */
val *= GT_FREQUENCY_MULTIPLIER;
return sysfs_emit(buff, "%u\n", val);
}
static DEVICE_ATTR_RW(media_freq_factor);
static struct device_attribute dev_attr_media_freq_factor_scale =
__ATTR(media_freq_factor.scale, 0444, freq_factor_scale_show, NULL);
static DEVICE_ATTR_RO(media_RP0_freq_mhz);
static DEVICE_ATTR_RO(media_RPn_freq_mhz);
static const struct attribute *media_perf_power_attrs[] = {
&dev_attr_media_freq_factor.attr,
&dev_attr_media_freq_factor_scale.attr,
&dev_attr_media_RP0_freq_mhz.attr,
&dev_attr_media_RPn_freq_mhz.attr,
NULL
};
static int intel_sysfs_rps_init(struct intel_gt *gt, struct kobject *kobj,
const struct attribute * const *attrs)
{
......@@ -599,4 +768,12 @@ void intel_gt_sysfs_pm_init(struct intel_gt *gt, struct kobject *kobj)
drm_warn(&gt->i915->drm,
"failed to create gt%u throttle sysfs files (%pe)",
gt->info.id, ERR_PTR(ret));
if (HAS_MEDIA_RATIO_MODE(gt->i915) && intel_uc_uses_guc_slpc(&gt->uc)) {
ret = sysfs_create_files(kobj, media_perf_power_attrs);
if (ret)
drm_warn(&gt->i915->drm,
"failed to create gt%u media_perf_power_attrs sysfs (%pe)\n",
gt->info.id, ERR_PTR(ret));
}
}
......@@ -59,6 +59,13 @@ enum intel_steering_type {
MSLICE,
LNCF,
/*
* On some platforms there are multiple types of MCR registers that
* will always return a non-terminated value at instance (0, 0). We'll
* lump those all into a single category to keep things simple.
*/
INSTANCE0,
NUM_STEERING_TYPES
};
......@@ -221,9 +228,13 @@ struct intel_gt {
struct {
u8 uc_index;
u8 wb_index; /* Only used on HAS_L3_CCS_READ() platforms */
} mocs;
struct intel_pxp pxp;
/* gt/gtN sysfs */
struct kobject sysfs_gt;
};
enum intel_gt_scratch_field {
......
......@@ -306,6 +306,15 @@ struct i915_address_space {
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
u32 flags);
void (*raw_insert_page)(struct i915_address_space *vm,
dma_addr_t addr,
u64 offset,
enum i915_cache_level cache_level,
u32 flags);
void (*raw_insert_entries)(struct i915_address_space *vm,
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
u32 flags);
void (*cleanup)(struct i915_address_space *vm);
void (*foreach)(struct i915_address_space *vm,
......@@ -345,6 +354,19 @@ struct i915_ggtt {
bool do_idle_maps;
/**
* @pte_lost: Are ptes lost on resume?
*
* Whether the system was recently restored from hibernate and
* thus may have lost pte content.
*/
bool pte_lost;
/**
* @probed_pte: Probed pte value on suspend. Re-checked on resume.
*/
u64 probed_pte;
int mtrr;
/** Bit 6 swizzling required for X tiling */
......@@ -548,14 +570,13 @@ i915_page_dir_dma_addr(const struct i915_ppgtt *ppgtt, const unsigned int n)
void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt,
unsigned long lmem_pt_obj_flags);
void intel_ggtt_bind_vma(struct i915_address_space *vm,
struct i915_vm_pt_stash *stash,
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
u32 flags);
struct i915_vm_pt_stash *stash,
struct i915_vma_resource *vma_res,
enum i915_cache_level cache_level,
u32 flags);
void intel_ggtt_unbind_vma(struct i915_address_space *vm,
struct i915_vma_resource *vma_res);
struct i915_vma_resource *vma_res);
int i915_ggtt_probe_hw(struct drm_i915_private *i915);
int i915_ggtt_init_hw(struct drm_i915_private *i915);
......@@ -581,6 +602,17 @@ bool i915_ggtt_resume_vm(struct i915_address_space *vm);
void i915_ggtt_suspend(struct i915_ggtt *gtt);
void i915_ggtt_resume(struct i915_ggtt *ggtt);
/**
* i915_ggtt_mark_pte_lost - Mark ggtt ptes as lost or clear such a marking
* @i915: The device private.
* @val: Whether the ptes should be marked as lost.
*
* In some cases pte content is retained across suspend, but it is
* typically lost across hibernate. Ptes should therefore be marked as
* lost on hibernation restore, and the marking cleared again on suspend.
*/
void i915_ggtt_mark_pte_lost(struct drm_i915_private *i915, bool val);
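/*
 * Hedged usage sketch (illustrative; the callers live in the
 * suspend/resume paths, which are not shown in this hunk):
 *
 * i915_ggtt_mark_pte_lost(i915, true);  // on hibernation restore
 * i915_ggtt_mark_pte_lost(i915, false); // on suspend
 */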
void
fill_page_dma(struct drm_i915_gem_object *p, const u64 val, unsigned int count);
......@@ -627,7 +659,6 @@ release_pd_entry(struct i915_page_directory * const pd,
struct i915_page_table * const pt,
const struct drm_i915_gem_object * const scratch);
void gen6_ggtt_invalidate(struct i915_ggtt *ggtt);
void gen8_ggtt_invalidate(struct i915_ggtt *ggtt);
void ppgtt_bind_vma(struct i915_address_space *vm,
struct i915_vm_pt_stash *stash,
......
......@@ -111,16 +111,6 @@ enum {
#define XEHP_SW_COUNTER_SHIFT 58
#define XEHP_SW_COUNTER_WIDTH 6
static inline u32 lrc_desc_priority(int prio)
{
if (prio > I915_PRIORITY_NORMAL)
return GEN12_CTX_PRIORITY_HIGH;
else if (prio < I915_PRIORITY_NORMAL)
return GEN12_CTX_PRIORITY_LOW;
else
return GEN12_CTX_PRIORITY_NORMAL;
}
static inline void lrc_runtime_start(struct intel_context *ce)
{
struct intel_context_stats *stats = &ce->stats;
......
......@@ -23,6 +23,7 @@ struct drm_i915_mocs_table {
unsigned int n_entries;
const struct drm_i915_mocs_entry *table;
u8 uc_index;
u8 wb_index; /* Only used on HAS_L3_CCS_READ() platforms */
u8 unused_entries_index;
};
......@@ -47,6 +48,7 @@ struct drm_i915_mocs_table {
/* Helper defines */
#define GEN9_NUM_MOCS_ENTRIES 64 /* 63-64 are reserved, but configured. */
#define PVC_NUM_MOCS_ENTRIES 3
/* (e)LLC caching options */
/*
......@@ -394,6 +396,17 @@ static const struct drm_i915_mocs_entry dg2_mocs_table_g10_ax[] = {
MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
};
static const struct drm_i915_mocs_entry pvc_mocs_table[] = {
/* Error */
MOCS_ENTRY(0, 0, L3_3_WB),
/* UC */
MOCS_ENTRY(1, 0, L3_1_UC),
/* WB */
MOCS_ENTRY(2, 0, L3_3_WB),
};
enum {
HAS_GLOBAL_MOCS = BIT(0),
HAS_ENGINE_MOCS = BIT(1),
......@@ -423,7 +436,14 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
memset(table, 0, sizeof(struct drm_i915_mocs_table));
table->unused_entries_index = I915_MOCS_PTE;
if (IS_DG2(i915)) {
if (IS_PONTEVECCHIO(i915)) {
table->size = ARRAY_SIZE(pvc_mocs_table);
table->table = pvc_mocs_table;
table->n_entries = PVC_NUM_MOCS_ENTRIES;
table->uc_index = 1;
table->wb_index = 2;
table->unused_entries_index = 2;
} else if (IS_DG2(i915)) {
if (IS_DG2_GRAPHICS_STEP(i915, G10, STEP_A0, STEP_B0)) {
table->size = ARRAY_SIZE(dg2_mocs_table_g10_ax);
table->table = dg2_mocs_table_g10_ax;
......@@ -622,6 +642,8 @@ void intel_set_mocs_index(struct intel_gt *gt)
get_mocs_settings(gt->i915, &table);
gt->mocs.uc_index = table.uc_index;
if (HAS_L3_CCS_READ(gt->i915))
gt->mocs.wb_index = table.wb_index;
}
void intel_mocs_init(struct intel_gt *gt)
......
......@@ -12,6 +12,7 @@
#include "gem/i915_gem_region.h"
#include "gem/i915_gem_ttm.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_mcr.h"
#include "gt/intel_gt_regs.h"
static int
......@@ -101,14 +102,24 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
return ERR_PTR(-ENODEV);
if (HAS_FLAT_CCS(i915)) {
resource_size_t lmem_range;
u64 tile_stolen, flat_ccs_base;
lmem_size = pci_resource_len(pdev, 2);
flat_ccs_base = intel_gt_read_register(gt, XEHPSDV_FLAT_CCS_BASE_ADDR);
flat_ccs_base = (flat_ccs_base >> XEHPSDV_CCS_BASE_SHIFT) * SZ_64K;
lmem_range = intel_gt_mcr_read_any(&i915->gt0, XEHP_TILE0_ADDR_RANGE) & 0xFFFF;
lmem_size = lmem_range >> XEHP_TILE_LMEM_RANGE_SHIFT;
lmem_size *= SZ_1G;
flat_ccs_base = intel_gt_mcr_read_any(gt, XEHP_FLAT_CCS_BASE_ADDR);
flat_ccs_base = (flat_ccs_base >> XEHP_CCS_BASE_SHIFT) * SZ_64K;
/* FIXME: Remove this when we have small-bar enabled */
if (pci_resource_len(pdev, 2) < lmem_size) {
drm_err(&i915->drm, "System requires small-BAR support, which is currently unsupported on this kernel\n");
return ERR_PTR(-EINVAL);
}
if (GEM_WARN_ON(lmem_size < flat_ccs_base))
return ERR_PTR(-ENODEV);
return ERR_PTR(-EIO);
tile_stolen = lmem_size - flat_ccs_base;
......@@ -131,7 +142,7 @@ static struct intel_memory_region *setup_lmem(struct intel_gt *gt)
io_start = pci_resource_start(pdev, 2);
io_size = min(pci_resource_len(pdev, 2), lmem_size);
if (!io_size)
return ERR_PTR(-ENODEV);
return ERR_PTR(-EIO);
min_page_size = HAS_64K_PAGES(i915) ? I915_GTT_PAGE_SIZE_64K :
I915_GTT_PAGE_SIZE_4K;
......
......@@ -117,7 +117,9 @@ static void flush_cs_tlb(struct intel_engine_cs *engine)
return;
/* ring should be idle before issuing a sync flush */
GEM_DEBUG_WARN_ON((ENGINE_READ(engine, RING_MI_MODE) & MODE_IDLE) == 0);
if ((ENGINE_READ(engine, RING_MI_MODE) & MODE_IDLE) == 0)
drm_warn(&engine->i915->drm, "%s not idle before sync flush!\n",
engine->name);
ENGINE_WRITE_FW(engine, RING_INSTPM,
_MASKED_BIT_ENABLE(INSTPM_TLB_INVALIDATE |
......@@ -596,8 +598,9 @@ static void ring_context_reset(struct intel_context *ce)
clear_bit(CONTEXT_VALID_BIT, &ce->flags);
}
static void ring_context_ban(struct intel_context *ce,
struct i915_request *rq)
static void ring_context_revoke(struct intel_context *ce,
struct i915_request *rq,
unsigned int preempt_timeout_ms)
{
struct intel_engine_cs *engine;
......@@ -632,7 +635,7 @@ static const struct intel_context_ops ring_context_ops = {
.cancel_request = ring_context_cancel_request,
.ban = ring_context_ban,
.revoke = ring_context_revoke,
.pre_pin = ring_context_pre_pin,
.pin = ring_context_pin,
......
......@@ -1075,7 +1075,9 @@ static u32 intel_rps_read_state_cap(struct intel_rps *rps)
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
if (IS_XEHPSDV(i915))
if (IS_PONTEVECCHIO(i915))
return intel_uncore_read(uncore, PVC_RP_STATE_CAP);
else if (IS_XEHPSDV(i915))
return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
else if (IS_GEN9_LP(i915))
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
......
......@@ -25,12 +25,16 @@ struct drm_printer;
/*
* Maximum number of subslices that can exist within a HSW-style slice. This
* is only relevant to pre-Xe_HP platforms (Xe_HP and beyond use the
* GEN_MAX_DSS value below).
* I915_MAX_SS_FUSE_BITS value below).
*/
#define GEN_MAX_SS_PER_HSW_SLICE 6
/* Maximum number of DSS on newer platforms (Xe_HP and beyond). */
#define GEN_MAX_DSS 32
/*
* Maximum number of 32-bit registers used by hardware to express the
* enabled/disabled subslices.
*/
#define I915_MAX_SS_FUSE_REGS 2
#define I915_MAX_SS_FUSE_BITS (I915_MAX_SS_FUSE_REGS * 32)
/* Maximum number of EUs that can exist within a subslice or DSS. */
#define GEN_MAX_EUS_PER_SS 16
......@@ -38,7 +42,7 @@ struct drm_printer;
#define SSEU_MAX(a, b) ((a) > (b) ? (a) : (b))
/* The maximum number of bits needed to express each subslice/DSS independently */
#define GEN_SS_MASK_SIZE SSEU_MAX(GEN_MAX_DSS, \
#define GEN_SS_MASK_SIZE SSEU_MAX(I915_MAX_SS_FUSE_BITS, \
GEN_MAX_HSW_SLICES * GEN_MAX_SS_PER_HSW_SLICE)
#define GEN_SSEU_STRIDE(max_entries) DIV_ROUND_UP(max_entries, BITS_PER_BYTE)
......@@ -49,15 +53,28 @@ struct drm_printer;
#define GEN_DSS_PER_CSLICE 8
#define GEN_DSS_PER_MSLICE 8
#define GEN_MAX_GSLICES (GEN_MAX_DSS / GEN_DSS_PER_GSLICE)
#define GEN_MAX_CSLICES (GEN_MAX_DSS / GEN_DSS_PER_CSLICE)
#define GEN_MAX_GSLICES (I915_MAX_SS_FUSE_BITS / GEN_DSS_PER_GSLICE)
#define GEN_MAX_CSLICES (I915_MAX_SS_FUSE_BITS / GEN_DSS_PER_CSLICE)
typedef union {
u8 hsw[GEN_MAX_HSW_SLICES];
/* Bitmap compatible with linux/bitmap.h; may exceed size of u64 */
unsigned long xehp[BITS_TO_LONGS(I915_MAX_SS_FUSE_BITS)];
} intel_sseu_ss_mask_t;
#define XEHP_BITMAP_BITS(mask) ((int)BITS_PER_TYPE(typeof(mask.xehp)))
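/*
 * Illustrative sketch (an assumption, not part of this patch): since the
 * xehp member is a regular linux/bitmap.h bitmap, enabled DSS can be
 * walked with the standard helpers, e.g.
 *
 * unsigned int dss;
 * for_each_set_bit(dss, sseu->subslice_mask.xehp,
 *                  XEHP_BITMAP_BITS(sseu->subslice_mask))
 *         use_dss(dss); // use_dss() is a placeholder
 */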
struct sseu_dev_info {
u8 slice_mask;
u8 subslice_mask[GEN_SS_MASK_SIZE];
u8 geometry_subslice_mask[GEN_SS_MASK_SIZE];
u8 compute_subslice_mask[GEN_SS_MASK_SIZE];
u8 eu_mask[GEN_SS_MASK_SIZE * GEN_MAX_EU_STRIDE];
intel_sseu_ss_mask_t subslice_mask;
intel_sseu_ss_mask_t geometry_subslice_mask;
intel_sseu_ss_mask_t compute_subslice_mask;
union {
u16 hsw[GEN_MAX_HSW_SLICES][GEN_MAX_SS_PER_HSW_SLICE];
u16 xehp[I915_MAX_SS_FUSE_BITS];
} eu_mask;
u16 eu_total;
u8 eu_per_subslice;
u8 min_eu_in_pool;
......@@ -66,14 +83,16 @@ struct sseu_dev_info {
u8 has_slice_pg:1;
u8 has_subslice_pg:1;
u8 has_eu_pg:1;
/*
* For Xe_HP and beyond, the hardware no longer has traditional slices
* so we just report the entire DSS pool under a fake "slice 0."
*/
u8 has_xehp_dss:1;
/* Topology fields */
u8 max_slices;
u8 max_subslices;
u8 max_eus_per_subslice;
u8 ss_stride;
u8 eu_stride;
};
/*
......@@ -91,7 +110,7 @@ intel_sseu_from_device_info(const struct sseu_dev_info *sseu)
{
struct intel_sseu value = {
.slice_mask = sseu->slice_mask,
.subslice_mask = sseu->subslice_mask[0],
.subslice_mask = sseu->subslice_mask.hsw[0],
.min_eus_per_subslice = sseu->max_eus_per_subslice,
.max_eus_per_subslice = sseu->max_eus_per_subslice,
};
......@@ -103,18 +122,28 @@ static inline bool
intel_sseu_has_subslice(const struct sseu_dev_info *sseu, int slice,
int subslice)
{
u8 mask;
int ss_idx = subslice / BITS_PER_BYTE;
if (slice >= sseu->max_slices ||
subslice >= sseu->max_subslices)
return false;
GEM_BUG_ON(ss_idx >= sseu->ss_stride);
mask = sseu->subslice_mask[slice * sseu->ss_stride + ss_idx];
if (sseu->has_xehp_dss)
return test_bit(subslice, sseu->subslice_mask.xehp);
else
return sseu->subslice_mask.hsw[slice] & BIT(subslice);
}
return mask & BIT(subslice % BITS_PER_BYTE);
/*
* Used to obtain the index of the first DSS. Can start searching from the
* beginning of a specific dss group (e.g., gslice, cslice, etc.) if
* groupsize and groupnum are non-zero.
*/
static inline unsigned int
intel_sseu_find_first_xehp_dss(const struct sseu_dev_info *sseu, int groupsize,
int groupnum)
{
return find_next_bit(sseu->subslice_mask.xehp,
XEHP_BITMAP_BITS(sseu->subslice_mask),
groupnum * groupsize);
}
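/*
 * Example (illustrative): the first enabled DSS of gslice 2 is
 * intel_sseu_find_first_xehp_dss(sseu, GEN_DSS_PER_GSLICE, 2);
 * with groupsize and groupnum both zero the search starts from DSS 0.
 */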
void intel_sseu_set_info(struct sseu_dev_info *sseu, u8 max_slices,
......@@ -124,14 +153,10 @@ unsigned int
intel_sseu_subslice_total(const struct sseu_dev_info *sseu);
unsigned int
intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
intel_sseu_get_hsw_subslices(const struct sseu_dev_info *sseu, u8 slice);
u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice);
u32 intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu);
void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
u8 *subslice_mask, u32 ss_mask);
intel_sseu_ss_mask_t
intel_sseu_get_compute_subslices(const struct sseu_dev_info *sseu);
void intel_sseu_info_init(struct intel_gt *gt);
......@@ -143,6 +168,15 @@ void intel_sseu_print_topology(struct drm_i915_private *i915,
const struct sseu_dev_info *sseu,
struct drm_printer *p);
u16 intel_slicemask_from_dssmask(u64 dss_mask, int dss_per_slice);
u16 intel_slicemask_from_xehp_dssmask(intel_sseu_ss_mask_t dss_mask, int dss_per_slice);
int intel_sseu_copy_eumask_to_user(void __user *to,
const struct sseu_dev_info *sseu);
int intel_sseu_copy_ssmask_to_user(void __user *to,
const struct sseu_dev_info *sseu);
void intel_sseu_print_ss_info(const char *type,
const struct sseu_dev_info *sseu,
struct seq_file *m);
#endif /* __INTEL_SSEU_H__ */
......@@ -4,6 +4,7 @@
* Copyright © 2020 Intel Corporation
*/
#include <linux/bitmap.h>
#include <linux/string_helpers.h>
#include "i915_drv.h"
......@@ -11,14 +12,6 @@
#include "intel_gt_regs.h"
#include "intel_sseu_debugfs.h"
static void sseu_copy_subslices(const struct sseu_dev_info *sseu,
int slice, u8 *to_mask)
{
int offset = slice * sseu->ss_stride;
memcpy(&to_mask[offset], &sseu->subslice_mask[offset], sseu->ss_stride);
}
static void cherryview_sseu_device_status(struct intel_gt *gt,
struct sseu_dev_info *sseu)
{
......@@ -41,7 +34,7 @@ static void cherryview_sseu_device_status(struct intel_gt *gt,
continue;
sseu->slice_mask = BIT(0);
sseu->subslice_mask[0] |= BIT(ss);
sseu->subslice_mask.hsw[0] |= BIT(ss);
eu_cnt = ((sig1[ss] & CHV_EU08_PG_ENABLE) ? 0 : 2) +
((sig1[ss] & CHV_EU19_PG_ENABLE) ? 0 : 2) +
((sig1[ss] & CHV_EU210_PG_ENABLE) ? 0 : 2) +
......@@ -92,7 +85,7 @@ static void gen11_sseu_device_status(struct intel_gt *gt,
continue;
sseu->slice_mask |= BIT(s);
sseu_copy_subslices(&info->sseu, s, sseu->subslice_mask);
sseu->subslice_mask.hsw[s] = info->sseu.subslice_mask.hsw[s];
for (ss = 0; ss < info->sseu.max_subslices; ss++) {
unsigned int eu_cnt;
......@@ -147,21 +140,17 @@ static void gen9_sseu_device_status(struct intel_gt *gt,
sseu->slice_mask |= BIT(s);
if (IS_GEN9_BC(gt->i915))
sseu_copy_subslices(&info->sseu, s,
sseu->subslice_mask);
sseu->subslice_mask.hsw[s] = info->sseu.subslice_mask.hsw[s];
for (ss = 0; ss < info->sseu.max_subslices; ss++) {
unsigned int eu_cnt;
u8 ss_idx = s * info->sseu.ss_stride +
ss / BITS_PER_BYTE;
if (IS_GEN9_LP(gt->i915)) {
if (!(s_reg[s] & (GEN9_PGCTL_SS_ACK(ss))))
/* skip disabled subslice */
continue;
sseu->subslice_mask[ss_idx] |=
BIT(ss % BITS_PER_BYTE);
sseu->subslice_mask.hsw[s] |= BIT(ss);
}
eu_cnt = eu_reg[2 * s + ss / 2] & eu_mask[ss % 2];
......@@ -188,8 +177,7 @@ static void bdw_sseu_device_status(struct intel_gt *gt,
if (sseu->slice_mask) {
sseu->eu_per_subslice = info->sseu.eu_per_subslice;
for (s = 0; s < fls(sseu->slice_mask); s++)
sseu_copy_subslices(&info->sseu, s,
sseu->subslice_mask);
sseu->subslice_mask.hsw[s] = info->sseu.subslice_mask.hsw[s];
sseu->eu_total = sseu->eu_per_subslice *
intel_sseu_subslice_total(sseu);
......@@ -208,7 +196,6 @@ static void i915_print_sseu_info(struct seq_file *m,
const struct sseu_dev_info *sseu)
{
const char *type = is_available_info ? "Available" : "Enabled";
int s;
seq_printf(m, " %s Slice Mask: %04x\n", type,
sseu->slice_mask);
......@@ -216,10 +203,7 @@ static void i915_print_sseu_info(struct seq_file *m,
hweight8(sseu->slice_mask));
seq_printf(m, " %s Subslice Total: %u\n", type,
intel_sseu_subslice_total(sseu));
for (s = 0; s < fls(sseu->slice_mask); s++) {
seq_printf(m, " %s Slice%i subslices: %u\n", type,
s, intel_sseu_subslices_per_slice(sseu, s));
}
intel_sseu_print_ss_info(type, sseu, m);
seq_printf(m, " %s EU Total: %u\n", type,
sseu->eu_total);
seq_printf(m, " %s EU Per Subslice: %u\n", type,
......
......@@ -976,6 +976,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
{
struct i915_gpu_error *global = &gt->i915->gpu_error;
struct intel_engine_cs *engine, *other;
struct active_engine *threads;
enum intel_engine_id id, tmp;
struct hang h;
int err = 0;
......@@ -996,8 +997,11 @@ static int __igt_reset_engines(struct intel_gt *gt,
h.ctx->sched.priority = 1024;
}
threads = kmalloc_array(I915_NUM_ENGINES, sizeof(*threads), GFP_KERNEL);
if (!threads)
return -ENOMEM;
for_each_engine(engine, gt, id) {
struct active_engine threads[I915_NUM_ENGINES] = {};
unsigned long device = i915_reset_count(global);
unsigned long count = 0, reported;
bool using_guc = intel_engine_uses_guc(engine);
......@@ -1016,7 +1020,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
break;
}
memset(threads, 0, sizeof(threads));
memset(threads, 0, sizeof(*threads) * I915_NUM_ENGINES);
for_each_engine(other, gt, tmp) {
struct task_struct *tsk;
......@@ -1236,6 +1240,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
break;
}
}
kfree(threads);
if (intel_gt_is_wedged(gt))
err = -EIO;
......
......@@ -122,6 +122,12 @@ enum slpc_param_id {
SLPC_MAX_PARAM = 32,
};
enum slpc_media_ratio_mode {
SLPC_MEDIA_RATIO_MODE_DYNAMIC_CONTROL = 0,
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_ONE = 1,
SLPC_MEDIA_RATIO_MODE_FIXED_ONE_TO_TWO = 2,
};
enum slpc_event_id {
SLPC_EVENT_RESET = 0,
SLPC_EVENT_SHUTDOWN = 1,
......
......@@ -310,8 +310,8 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
if (IS_DG2(gt->i915))
flags |= GUC_WA_DUAL_QUEUE;
/* Wa_22011802037: graphics version 12 */
if (GRAPHICS_VER(gt->i915) == 12)
/* Wa_22011802037: graphics version 11/12 */
if (IS_GRAPHICS_VER(gt->i915, 11, 12))
flags |= GUC_WA_PRE_PARSER;
/* Wa_16011777198:dg2 */
......@@ -327,6 +327,10 @@ static u32 guc_ctl_wa_flags(struct intel_guc *guc)
IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_FOREVER))
flags |= GUC_WA_CONTEXT_ISOLATION;
/* Wa_16015675438 */
if (!RCS_MASK(gt))
flags |= GUC_WA_RCS_REGS_IN_CCS_REGS_LIST;
return flags;
}
......
......@@ -230,6 +230,14 @@ struct intel_guc {
* @shift: Right shift value for the gpm timestamp
*/
u32 shift;
/**
* @last_stat_jiffies: jiffies at last actual stats collection time.
* We use this timestamp to ensure we don't oversample the stats
* because runtime power management events can trigger stats
* collection at much higher rates than required.
*/
unsigned long last_stat_jiffies;
} timestamp;
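/*
 * Hedged sketch (an assumption; the actual guard lives in the busyness
 * code, not in this header): consumers rate-limit collection roughly as
 *
 * if (time_after(jiffies, guc->timestamp.last_stat_jiffies +
 *                msecs_to_jiffies(INTERVAL_MS)))
 *         collect_stats();
 *
 * where INTERVAL_MS and collect_stats() are placeholders.
 */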
#ifdef CONFIG_DRM_I915_SELFTEST
......
......@@ -7,6 +7,7 @@
#include "gt/intel_engine_regs.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_mcr.h"
#include "gt/intel_gt_regs.h"
#include "gt/intel_lrc.h"
#include "gt/shmem_utils.h"
......@@ -313,7 +314,7 @@ static long __must_check guc_mmio_reg_add(struct intel_gt *gt,
* tracking, it is easier to just program the default steering for all
* regs that don't need a non-default one.
*/
intel_gt_get_valid_steering_for_reg(gt, reg, &group, &inst);
intel_gt_mcr_get_nonterminated_steering(gt, reg, &group, &inst);
entry.flags |= GUC_REGSET_STEERING(group, inst);
slot = __mmio_reg_add(regset, &entry);
......@@ -457,7 +458,7 @@ static void fill_engine_enable_masks(struct intel_gt *gt,
{
info_map_write(info_map, engine_enabled_masks[GUC_RENDER_CLASS], RCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_COMPUTE_CLASS], CCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], 1);
info_map_write(info_map, engine_enabled_masks[GUC_BLITTER_CLASS], BCS_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_VIDEO_CLASS], VDBOX_MASK(gt));
info_map_write(info_map, engine_enabled_masks[GUC_VIDEOENHANCE_CLASS], VEBOX_MASK(gt));
}
......
......@@ -105,6 +105,7 @@
#define GUC_WA_PRE_PARSER BIT(14)
#define GUC_WA_HOLD_CCS_SWITCHOUT BIT(17)
#define GUC_WA_POLLCS BIT(18)
#define GUC_WA_RCS_REGS_IN_CCS_REGS_LIST BIT(21)
#define GUC_CTL_FEATURE 2
#define GUC_CTL_ENABLE_SLPC BIT(2)
......
......@@ -94,9 +94,9 @@ static int guc_hwconfig_fill_buffer(struct intel_guc *guc, struct intel_hwconfig
static bool has_table(struct drm_i915_private *i915)
{
if (IS_ALDERLAKE_P(i915))
if (IS_ALDERLAKE_P(i915) && !IS_ADLP_N(i915))
return true;
if (IS_DG2(i915))
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55))
return true;
return false;
......