- 14 Aug, 2015 40 commits
-
-
Jani Nikula authored
There's so much scaler debugging messages that it makes other debugging hard. Remove them. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Dave Gordon authored
This provides a means of reading status and counts relating to GuC actions and submissions. v2: Remove surplus blank line in output [Chris Wilson] v5: Added GuC per-engine submission & seqno statistics v6: Add per-ring statistics to client, refactor client-dumper. Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Signed-off-by: Alex Dai <yu.dai@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Alex Dai authored
GuC-based submission is mostly the same as execlist mode, up to intel_logical_ring_advance_and_submit(), where the context being dispatched would be added to the execlist queue; at this point we submit the context to the GuC backend instead. There are, however, a few other changes also required, notably: 1. Contexts must be pinned at GGTT addresses accessible by the GuC i.e. NOT in the range [0..WOPCM_SIZE), so we have to add the PIN_OFFSET_BIAS flag to the relevant GGTT-pinning calls. 2. The GuC's TLB must be invalidated after a context is pinned at a new GGTT address. 3. GuC firmware uses the one page before Ring Context as shared data. Therefore, whenever driver wants to get base address of LRC, we will offset one page for it. LRC_PPHWSP_PN is defined as the page number of LRCA. 4. In the work queue used to pass requests to the GuC, the GuC firmware requires the ring-tail-offset to be represented as an 11-bit value, expressed in QWords. Therefore, the ringbuffer size must be reduced to the representable range (4 pages). v2: Defer adding #defines until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v4: Squashed kerneldoc patch into here [Daniel Vetter] v5: Update request->tail in code common to both GuC and execlist modes. Add a private version of lr_context_update(), as sharing the execlist version leads to race conditions when the CPU and the GuC both update TAIL in the context image. Conversion of error-captured HWS page to string must account for offset from start of object to actual HWS (LRC_PPHWSP_PN). Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Dave Gordon authored
Turn on interrupt steering to route necessary interrupts to GuC. v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Dave Gordon authored
A GuC client has its own doorbell and workqueue. It maintains the doorbell cache line, process description object and work queue item. A default guc_client is created for the i915 driver to use for normal-priority in-order submission. Note that the created client is not yet ready for use; doorbell allocation will fail as we haven't yet linked the GuC's context descriptor to the default contexts for each ring (see later patch). v2: Defer adding structure members until needed [Chris Wilson] Rationalise type declarations [Chris Wilson] v5: Add GuC per-engine submission & seqno statistics. Move wq locking to encompass both get_space() and add_item(). Take forcewake lock in host2guc_action() [Tom O'Rourke] v6: Fix GuC doorbell cacheline selection code (the cacheline-within-page calculation was wrong). Rename GuC priorities to make them closer to the names used in the GuC firmware source, matching what the autogenerated versions will (probably) be. Add per-ring statistics to client. Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Alex Dai authored
Allocate a GEM object to hold GuC log data. A debugfs interface (i915_guc_log_dump) is provided to print out the log content. v2: Add struct members at point of use [Chris Wilson] v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Alex Dai authored
This adds the first of the data structures used to communicate with the GuC (the pool of guc_context structures). We create a GuC-specific wrapper round the GEM object allocator as all GEM objects shared with the GuC must be pinned into GGTT space at an address that is NOT in the range [0..WOPCM_TOP), as that range of GGTT addresses is not accessible to the GuC (from the GuC's point of view, it's permanently reserved for other objects such as the BootROM & SRAM). Later, we will need to allocate additional GuC-sharable objects for the submission client(s) and the GuC's debug log. v2: Remove redundant initialisation [Chris Wilson] Defer adding struct members until needed [Chris Wilson] Local functions should pass dev_priv rather than dev [Chris Wilson] v5: Invalidate GuC TLB after allocating and pinning a new object v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Dave Gordon authored
GuC submission is basically execlist submission, but with the GuC handling the actual writes to the ELSP and the resulting context switch interrupts. So to describe a context for submission via the GuC, we need one of the same functions used in execlist mode. This commit exposes one such function, changing its name to better describe what it does (it's related to logical ring contexts rather than to execlists per se). v2: Replaces previous "drm/i915: Move execlists defines from .c to .h" v3: Incorporates a change to one of the functions exposed here that was previously part of an internal patch, but which was omitted from the version recently committed to drm-intel-nightly: 7a01a0a2 drm/i915/lrc: Update PDPx registers with lri commands So we reinstate this change here. v4: Drop v3 change, update function parameters due to collision with 8ee36152 drm/i915: Convert execlists_ctx_descriptor() for requests v5: Don't expose execlists_update_context() after all. The current version is no longer compatible with GuC submission; trying to share the execlist version of this function results in both GuC and CPU updating TAIL in the context image, with bad results when they get out of step. The GuC submission path now has its own private version that just updates the ringbuffer start address, and not TAIL or PDPx. v6: Rebased Issue: VIZ-4884 Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Alex Dai authored
The new node provides access to the status of the GuC-specific loader; also the scratch registers used for communication between the i915 driver and the GuC firmware. v2: Changes to output formats per Chris Wilson's suggestions v6: Rebased Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Alex Dai authored
This fetches the required firmware image from the filesystem, then loads it into the GuC's memory via a dedicated DMA engine. This patch is derived from GuC loading work originally done by Vinit Azad and Ben Widawsky. v2: Various improvements per review comments by Chris Wilson v3: Removed 'wait' parameter to intel_guc_ucode_load() as firmware prefetch is no longer supported in the common firmware loader, per Daniel Vetter's request. Firmware checker callback fn now returns errno rather than bool. v4: Squash uC-independent code into GuC-specifc loader [Daniel Vetter] Don't keep the driver working (by falling back to execlist mode) if GuC firmware loading fails [Daniel Vetter] v5: Clarify WOPCM-related #defines [Tom O'Rourke] Delete obsolete code no longer required with current h/w & f/w [Tom O'Rourke] Move the call to intel_guc_ucode_init() later, so that it can allocate GEM objects, and have it fetch the firmware; then intel_guc_ucode_load() doesn't need to fetch it later. [Daniel Vetter]. v6: Update comment describing intel_guc_ucode_load() [Tom O'Rourke] Issue: VIZ-4884 Signed-off-by: Alex Dai <yu.dai@intel.com> Signed-off-by: Dave Gordon <david.s.gordon@intel.com> Reviewed-by: Tom O'Rourke <Tom.O'Rourke@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
We only need the link_bw/rate_select parameters when starting link training, and they should be computed based on the currently active config, so throw them out from intel_dp and just compute on demand. Toss in an extra debug print to see rate_select in addition to link_bw, as the latter may be 0 for eDP 1.4. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
intel_dp->link_bw is going away, so consul the port_clock instead when choosing between TP1 and TP3. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Currently we clobber intel_dp->lane_count in compute config, which means after a rejected modeset we may no longer be able to retrain the current link. Move lane_count into pipe_config to avoid that. v2: Add missing ':' to the pipe config debug dump Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Use a separate variable for the TRANS_DP_CTL value instead of reusing 'tmp' that otherwise contains the DP port register value. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
All the *_ddi_pll_select() functions get passed the port_clock and pipe config as parameters. We only need to pass the pipe config, and the functions can dig up the port_clock themselves. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Use port_clock instead of link_bw when picking the PLL parameters for DP. link_bw may be zero with an eDP 1.4 sink that supports DP_LINK_RATE_SET so we shouldn't use it for anything other than feed it to the sink appropriately. v2: Fix typo in commit message (Sivakumar) Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Currently we treat intel_{dp,hdmi}->color_range as partly user controller value (via the property) but we also change it during .compute_config() when using the "Automatic" mode. That is a bit confusing, so let's just change things so that we store the user property values in intel_dp, and only change what's stored in pipe_config during .compute_config(). There should be no functional change. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Chris Wilson authored
When we queue the command or operation to change the scanout address, we mark the flip as in progress. We can use this flag to prevent us from checking for a stalled flip prior to its existence! Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Ville Syrjälä authored
Currently we don't clflush on pin_to_display if the bo is already UC/WT and is not in the CPU write domain. This causes problems with pwrite since pwrite doesn't change the write domain, and it avoids clflushing on UC/WT buffers on LLC platforms unless the buffer is currently being scanned out. Fix the problem by marking the cache dirty and adjusting i915_gem_object_set_cache_level() to clflush when the cache is dirty even if the cache_level doesn't change. My last attempt [1] at fixing this via write domain frobbing was shot down, but now with the cache_dirty flag we can do things in a nicer way. [1] http://lists.freedesktop.org/archives/intel-gfx/2014-November/055390.html v2: Drop the I915_CACHE_NONE/WT checks from pwrite Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86422 Testcase: igt/kms_pwrite_crc Testcase: igt/gem_pwrite_snooped Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Sonika Jindal authored
WA for BXT A0/A1, where DDIB's HPD pin is swapped to DDIA, so enabling DDIA HPD pin in place of DDIB. v2: For DP, irq_port is used to determine the encoder instead of hpd_pin and removing the edp HPD logic because port A HPD is not present(Imre) v3: Rebased on top of Imre's patchset for enabling HPD on PORT A. Added hpd_pin swapping for intel_dp_init_connector, setting encoder for PORT_A as per the WA in irq_port (Imre) v4: Dont enable interrupt for edp, also reframe the description (Siva) v5: Don’t check for PORT_A in intel_ddi_init to update dig_port, instead avoid setting hpd_pin itself (Imre) Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Sonika Jindal authored
Also remove redundant comments. Signed-off-by: Sonika Jindal <sonika.jindal@intel.com> Reviewed-by: Sivakumar Thulasimani <sivakumar.thulasimani@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
And fix 0-DAY kernel test infrastructure warning. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
With the offset length being taken care of in ("drm/i915/gtt: Allow >= 4GB offsets in X86_32"), the code should be finally safe in 32-bit kernels. This reverts commit 501fd70f Author: Michel Thierry <michel.thierry@intel.com> Date: Fri May 29 14:15:05 2015 +0100 drm/i915: limit PPGTT size to 2GB in 32-bit platforms Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
Similar to commit c44ef60e ("drm/i915/gtt: Allow >= 4GB sizes for vm"), i915_gem_obj_offset and i915_gem_obj_ggtt_offset return an unsigned long, which in only 4-bytes long in 32-bit kernels. Change return type (and other related offset variables) to u64. Since Global GTT is always limited to 4GB, this change would not be required in i915_gem_obj_ggtt_offset, but this is done for consistency. v2: Remove unnecessary offset variable in do_pin, as we already have vma->node.start (Chris). Update GGTT offset too (Tvrtko). Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Rodrigo Vivi authored
By Vesa DP 1.2 spec TEST_CRC_COUNT is a "4 bit wrap counter which increments each time the TEST_CRC_x_x are updated." However if we are trying to verify the screen hasn't changed we get same (count, crc) pair twice. Without this patch we would return -ETIMEOUT in this case. So, if in 6 vblanks the pair (count, crc) hasn't changed we return it anyway instead of returning error and let test case decide if it was right or not. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Rodrigo Vivi authored
By Vesa DP 1.2 Spec TEST_CRC_COUNT should be "reset to 0 when TEST_SINK bit 0 = 0." However for some strange reason when PSR is enabled in certain platforms this is not true. At least not immediatelly. So we face cases like this: first get_sink_crc operation: count: 0, crc: 000000000000 count: 1, crc: c101c101c101 returned expected crc: c101c101c101 secont get_sink_crc operation: count: 1, crc: c101c101c101 count: 0, crc: 000000000000 count: 1, crc: 0000c1010000 should return expected crc: 0000c1010000 But also the reset to 0 should be faster resulting into: get_sink_crc operation: count: 1, crc: c101c101c101 count: 1, crc: 0000c1010000 should return expected crc: 0000c1010000 So in order to know that the second one is valid one we need to compare the pair (count, crc) with latest (count, crc). If the pair changed you have your valid CRC. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Rodrigo Vivi authored
By Vesa DP spec, test counter at DP_TEST_SINK_MISC just reset to 0 when unsetting DP_TEST_SINK_START, so let's force this stop here. But let's minimize the aux transactions and just do it when we know it hasn't been properly stoped. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
GTT was only 32b and its max value is 4GB. In order to allow objects bigger than 4GB in 48b PPGTT, i915_gem_userptr_ioctl we could check against max 48b range (1ULL << 48). But since the check no longer applies, just kill the limit. v2: Use the default ctx to infer the ppgtt max size (Akash). v3: Just kill the limit, it was only there for early detection of an error when used for execbuffer (Chris). Cc: Akash Goel <akash.goel@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
Otherwise it can overflow in 48-bit mode, and cause an incorrect exec_start. Before commit 5f19e2bf ("drm/i915: Merged the many do_execbuf() parameters into a structure"), it was already an u64. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
In a 48b world, users can try to allocate buffers bigger than 4GB; in these cases it is important that size is a 64b variable. v2: Drop the warning about bind with size 0, it shouldn't happen anyway. Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
v2: Clean up patch after rebases. v3: gen8_dump_ppgtt for 32b and 48b PPGTT. v4: Use used_pml4es/pdpes (Akash). v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. v6: Rely on used_px bits instead of null checking (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
v2: For semaphore errors, object is mapped to GGTT and offset will not be > 4GB, print only lower 32-bits (Akash) v3: Print gtt_offset in groups of 32-bit (Chris) Cc: Akash Goel <akash.goel@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
Similar to PDs, while setting up a page directory pointer, make all entries of the pdp point to the scratch pd before mapping (and make all its entries point to the scratch page); this is to be safe in case of out of bound access or proactive prefetch. Also add a scratch pdp, which the PML4 entries point to. v2: Handle scratch_pdp allocation failure correctly, and keep initialize_px functions together (Akash) v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Rely on the added macros to initialize the pdps. v4: Rebase after final merged version of Mika's ppgtt/scratch patches (and removed commit message part related to v3). v5: Update commit message to also mention PML4 table initialization and the new scratch pdp (Akash). Suggested-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
When 48b is enabled, gen8_ppgtt_insert_entries needs to read the Page Map Level 4 (PML4), before it selects which Page Directory Pointer (PDP) it will write to. Similarly, gen8_ppgtt_clear_range needs to get the correct PDP/PD range. This patch was inspired by Ben's "Depend exclusively on map and unmap_vma". v2: Rebase after s/page_tables/page_table/. v3: Remove unnecessary pdpe loop in gen8_ppgtt_clear_range_4lvl and use clamp_pdp in gen8_ppgtt_insert_entries (Akash). v4: Merge gen8_ppgtt_clear_range_4lvl into gen8_ppgtt_clear_range to maintain symmetry with gen8_ppgtt_insert_entries (Akash). v5: Do not mix pages and bytes in insert_entries (Akash). v6: Prevent overflow in sg_nents << PAGE_SHIFT, when inserting 4GB at once. v7: Rebase after Mika's ppgtt cleanup / scratch merge patch series. Use gen8_px_index functions, and remove unnecessary number of pages parameter in insert_pte_entries. v8: Change gen8_ppgtt_clear_pte_range to stop at PDP boundary, instead of adding and extra clamp function; remove unnecessary pdp_start/pdp_len variables (Akash). v9: pages->orig_nents instead of sg_nents(pages->sgl) to get the length (Akash). v10: Remove pdp warning check ingen8_ppgtt_insert_pte_entries until this commit (Akash). Reviewed-by: Akash Goel <akash.goel@intel.com> (v9) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
As a step towards implementing 4 levels, while not discarding the existing pte insert functions, we need to pass the sg_iter through. The current function understands to the page directory granularity. An object's pages may span the page directory, and so using the iter directly as we write the PTEs allows the iterator to stay coherent through a VMA insert operation spanning multiple page table levels. v2: Rebase after s/page_tables/page_table/. v3: Rebase after Mika's ppgtt cleanup / scratch merge patch series; updated commit message (s/map/insert). v4: Rebase. Reviewed-by: Akash Goel <akash.goel@intel.com> (v3) Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
In 64b (48bit canonical) PPGTT addressing, the PDP0 register contains the base address to PML4, while the other PDP registers are ignored. In LRC, the addressing mode must be specified in every context descriptor, and the base address to PML4 is stored in the reg state. v2: PML4 update in legacy context switch is left for historic reasons, the preferred mode of operation is with lrc context based submission. v3: s/gen8_map_page_directory/gen8_setup_page_directory and s/gen8_map_page_directory_pointer/gen8_setup_page_directory_pointer. Also, clflush will be needed for bxt. (Akash) v4: Squashed lrc-specific code and use a macro to set PML4 register. v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. PDP update in bb_start is only for legacy 32b mode. v6: Rebase after final merged version of Mika's ppgtt/scratch patches. v7: There is no need to update the pml4 register value in execlists_update_context. (Akash) v8: Move pd and pdp setup functions to a previous patch, they do not belong here. (Akash) v9: Check USES_FULL_48BIT_PPGTT instead of GEN8_CTX_ADDRESSING_MODE in gen8_emit_bb_start to check if emit pdps is needed. (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
PML4 has no special attributes, and there will always be a PML4. So simply initialize it at creation, and destroy it at the end. The code for 4lvl is able to call into the existing 3lvl page table code to handle all of the lower levels. v2: Return something at the end of gen8_alloc_va_range_4lvl to keep the compiler happy. And define ret only in one place. Updated gen8_ppgtt_unmap_pages and gen8_ppgtt_free to handle 4lvl. v3: Use i915_dma_unmap_single instead of pci API. Fix a couple of incorrect checks when unmapping pdp and pd pages (Akash). v4: Call __pdp_fini also for 32b PPGTT. Clean up alloc_pdp param list. v5: Prevent (harmless) out of range access in gen8_for_each_pml4e. v6: Simplify alloc_vma_range_4lvl and gen8_ppgtt_init_common error paths. (Akash) v7: Rebase, s/gen8_ppgtt_free_*/gen8_ppgtt_cleanup_*/. v8: Change location of pml4_init/fini. It will make next patches cleaner. v9: Rebase after Mika's ppgtt cleanup / scratch merge patch series, while trying to reuse as much as possible for pdp alloc. pml4_init/fini replaced by setup/cleanup_px macros. v10: Rebase after Mika's merged ppgtt cleanup patch series. v11: Rebase after final merged version of Mika's ppgtt/scratch patches. v12: Fix pdpe start value in trace (Akash) v13: Define all 4lvl functions in this patch directly, instead of previous patches, add i915_page_directory_pointer_entry_alloc here, use test_bit to detect when pdp is already allocated (Akash). v14: Move pdp allocation into a new gen8_ppgtt_alloc_page_dirpointers funtion, as we do for pds and pts; move pd and pdp setup functions to this patch (Akash). v15: Added kfree(pdp) from previous patch to this (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
Introduces the Page Map Level 4 (PML4), ie. the new top level structure of the page tables. To facilitate testing, 48b mode will be available on Broadwell and GEN9+, when i915.enable_ppgtt = 3. v2: Remove unnecessary CONFIG_X86_64 checks, ppgtt code is already 32/64-bit safe (Chris). v3: Add goto free_scratch in temp 48-bit mode init code (Akash). v4: kfree the pdp until the 4lvl alloc/free patch (Akash). v5: Postpone 48-bit code in sanitize_enable_ppgtt (Akash). v6: Keep _insert_pte_entries changes outside this patch (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Michel Thierry <michel.thierry@intel.com> Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
The dynamic page allocation patch series added it for GEN6, this patch adds them for GEN8. v2: Consolidate pagetable/page_directory events v3: Multiple rebases. v4: Rebase after s/page_tables/page_table/. v5: Rebase after Mika's ppgtt cleanup / scratch merge patch series. v6: Rebase after gen8_map_pagetable_range removal. v7: Use generic page name (px) in DECLARE_EVENT_CLASS (Akash) v8: Defer define of i915_page_directory_pointer_entry_alloc (Akash) Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v3+) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-
Michel Thierry authored
The insert_entries function was the function used to write PTEs. For the PPGTT it was "hardcoded" to only understand two level page tables, which was the case for GEN7. We can reuse this for 4 level page tables, and remove the concept of insert_entries, which was never viable past 2 level page tables anyway, but it requires a bit of rework to make the function a bit more generic. v2: Rebase after Mika's ppgtt cleanup / scratch merge patch series. v3: Rebase after final merged version of Mika's ppgtt/scratch patches. v4: Check and warn for NULL value of pdp pointer (Akash). Cc: Akash Goel <akash.goel@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Signed-off-by: Michel Thierry <michel.thierry@intel.com> (v2) Reviewed-by: Akash Goel <akash.goel@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
-