Commits · 8eb80946ab0c18a853be5f90d6b6ccbe3fd42989 · Kirill Smelkov / linux

09 Nov, 2023 4 commits

drm/edid: split out drm_eld.h from drm_edid.h · 8eb80946

Jani Nikula authored Oct 31, 2023

The drm_edid.[ch] files are starting to be a bit crowded, and with plans
to add more ELD related functionality, it's perhaps cleanest to split
the ELD code out to a header of its own.

Include drm_eld.h from drm_edid.h for starters, and leave it to
follow-up work to only include drm_eld.h where needed.

Cc: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com>
Reviewed-by: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/0c6d631fa1058036d72dd25d1cabc90a7c52490e.1698747331.git.jani.nikula@intel.com

8eb80946

MAINTAINERS: Add Maira to V3D maintainers · c400eb4d

Maíra Canal authored Nov 06, 2023

I've been contributing to V3D with improvements, reviews, testing and
debugging. Therefore, add myself as a co-maintainer of the V3D driver.
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231106134201.725805-1-mcanal@igalia.com

c400eb4d

drm/todo: Add entry to clean up former seltests suites · a0a0bd3e

Maxime Ripard authored Oct 25, 2023

Most of those suites are undocumented and aren't really clear about what
they are testing. Let's add a TODO entry as a future task to get started
into KUnit and DRM.
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Link: https://lore.kernel.org/r/20231025132428.723672-2-mripard@kernel.orgSigned-off-by: Maxime Ripard <mripard@kernel.org>

a0a0bd3e

drm/tests: Remove slow tests · 078a5b49

Maxime Ripard authored Oct 25, 2023

Both the drm_buddy and drm_mm tests have been converted from selftest to
kunit recently.

However, a significant portion of them are trying to exert some part of
their API over a huge number of iterations and with random variations of
their parameters.  They are thus more a way to discover new bugs than
actual unit tests.

This is fine in itself but leads to very slow runtime (up to 25s for
some drm_test_mm_reserve and drm_test_mm_insert on a Ryzen 7950x while
running the tests in qemu) which make them a poor fit for kunit.

Let's remove those tests from the drm_mm and drm_buddy test suites for
now, and if it's ever needed we can always create proper unit tests for
them later on.

This made the entire DRM tests execution time (as of v6.6-rc1) come from
65s to less than .5s on a Ryzen 7950x system when running under qemu,
and from 9 minutes to about 4s on a RaspberryPi4.
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Suggested-by: Daniel Vetter <daniel@ffwll.ch>
Link: https://lore.kernel.org/r/20231025132428.723672-1-mripard@kernel.orgSigned-off-by: Maxime Ripard <mripard@kernel.org>

078a5b49

08 Nov, 2023 6 commits

accel/ivpu: Use GEM shmem helper for all buffers · 8d88e4cd

Jacek Lawrynowicz authored Oct 31, 2023

Use struct drm_gem_shmem_object as a base for struct ivpu_bo.
This cuts by 50% the buffer management code.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-5-stanislaw.gruszka@linux.intel.com

8d88e4cd

accel/ivpu: Remove support for uncached buffers · 48d45fac

Jacek Lawrynowicz authored Oct 31, 2023

Usages of DRM_IVPU_BO_UNCACHED should be replaced by DRM_IVPU_BO_WC.
There is no functional benefit from DRM_IVPU_BO_UNCACHED if these
buffers are never mapped to host VM.

This allows to cut the buffer handling code in the kernel driver
by half.

Usage of DRM_IVPU_BO_UNCACHED buffers was removed from user-space
driver and will not be part of first UMD release.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-4-stanislaw.gruszka@linux.intel.com

48d45fac

accel/ivpu: Fix locking in ivpu_bo_remove_all_bos_from_context() · 48aea7f2

Jacek Lawrynowicz authored Oct 31, 2023

ivpu_bo_remove_all_bos_from_context() could race with ivpu_bo_free()
when prime buffer was closed after vpu device was closed.

Move the bo_list from context to vdev and use a dedicated lock to
sync it. This list is not modified when BO is added/removed from
a context.

Also rename ivpu_bo_free_vpu_addr() to ivpu_bo_unbind() because this
function does more then just free vpu_addr.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-3-stanislaw.gruszka@linux.intel.com

48aea7f2

accel/ivpu: Allocate vpu_addr in gem->open() callback · b0352241

Jacek Lawrynowicz authored Oct 31, 2023

Use gem->open() callback to simplify the code and prepare for gem_shmem
conversion. It is called during handle creation for a gem object,
during prime import and in BO_CREATE ioctl. Hence can be used for vpu_addr
allocation. On the way remove unused bo->user_ptr field.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073156.1301669-2-stanislaw.gruszka@linux.intel.com

b0352241

MAINTAINERS: Drop Emma Anholt from all M lines. · 89d04995

Emma Anholt authored Oct 31, 2023

I am not active in the Linux kernel and don't want to see patches.
Signed-off-by: Emma Anholt <emma@anholt.net>
Acked-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031181648.48675-1-emma@anholt.net

89d04995

drm/sched: Don't disturb the entity when in RR-mode scheduling · bc8d6a9d

Luben Tuikov authored Nov 02, 2023

Don't call drm_sched_select_entity() in drm_sched_run_job_queue(). In fact,
rename __drm_sched_run_job_queue() to just drm_sched_run_job_queue(), and let
it do just that, schedule the work item for execution.

The problem is that drm_sched_run_job_queue() calls drm_sched_select_entity()
to determine if the scheduler has an entity ready in one of its run-queues,
and in the case of the Round-Robin (RR) scheduling, the function
drm_sched_rq_select_entity_rr() does just that, selects the _next_ entity
which is ready, sets up the run-queue and completion and returns that
entity. The FIFO scheduling algorithm is unaffected.

Now, since drm_sched_run_job_work() also calls drm_sched_select_entity(), then
in the case of RR scheduling, that would result in drm_sched_select_entity()
having been called twice, which may result in skipping a ready entity if more
than one entity is ready. This commit fixes this by eliminating the call to
drm_sched_select_entity() from drm_sched_run_job_queue(), and leaves it only
in drm_sched_run_job_work().

v2: Rebased on top of Tvrtko's renames series of patches. (Luben)
Add fixes-tag. (Tvrtko)
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
Fixes: f7fe64ad ("drm/sched: Split free_job into own work item")
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231107041020.10035-2-ltuikov89@gmail.com

bc8d6a9d

07 Nov, 2023 1 commit

accel/ivpu: Fix compilation with CONFIG_PM=n · c015fb6d

Jacek Lawrynowicz authored Nov 06, 2023

Use pm_runtime_status_suspended() instead of dev->power.runtime_status
field that is not available without PM.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://lore.kernel.org/all/20231106130827.1600948-1-jacek.lawrynowicz@linux.intel.com

c015fb6d

06 Nov, 2023 4 commits

drm/panel: nt35510: fix typo · 27d9620e

Dario Binacchi authored Oct 23, 2023

Replace 'HFP' with 'HBP'.

Fixes: 899f24ed ("drm/panel: Add driver for Novatek NT35510-based panels")
Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com>
Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20231023090613.1694133-1-dario.binacchi@amarulasolutions.com

27d9620e

drm/v3d: Expose the total GPU usage stats on sysfs · 509433d8

Maíra Canal authored Sep 05, 2023

The previous patch exposed the accumulated amount of active time per
client for each V3D queue. But this doesn't provide a global notion of
the GPU usage.

Therefore, provide the accumulated amount of active time for each V3D
queue (BIN, RENDER, CSD, TFU and CACHE_CLEAN), considering all the jobs
submitted to the queue, independent of the client.

This data is exposed through the sysfs interface, so that if the
interface is queried at two different points of time the usage percentage
of each of the queues can be calculated.
Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com

509433d8

drm/v3d: Implement show_fdinfo() callback for GPU usage stats · 09a93cc4

Maíra Canal authored Sep 05, 2023

This patch exposes the accumulated amount of active time per client
through the fdinfo infrastructure. The amount of active time is exposed
for each V3D queue: BIN, RENDER, CSD, TFU and CACHE_CLEAN.

In order to calculate the amount of active time per client, a CPU clock
is used through the function local_clock(). The point where the jobs has
started is marked and is finally compared with the time that the job had
finished.

Moreover, the number of jobs submitted to each queue is also exposed on
fdinfo through the identifier "v3d-jobs-<queue>".
Co-developed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Melissa Wen <mwen@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230905213416.1290219-3-mcanal@igalia.com

09a93cc4

drm: Do not round to megabytes for greater than 1MiB sizes in fdinfo stats · 5faf6e18

Tvrtko Ursulin authored Sep 27, 2023

It is better not to lose precision and not revert to 1 MiB size
granularity for every size greater than 1 MiB.

Sizes in KiB should not be so troublesome to read (and in fact machine
parsing is I expect the norm here), they align with other api like
/proc/meminfo, and they allow writing tests for the interface without
having to embed drm.ko implementation knowledge into them. (Like knowing
that minimum buffer size one can use for successful verification has to be
1MiB aligned, and on top account for any pre-existing memory utilisation
outside of driver's control.)

But probably even more importantly I think that it is just better to show
the accurate sizes and not arbitrary lose precision for a little bit of a
stretched use case of eyeballing fdinfo text directly.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Rob Clark <robdclark@gmail.com>
Cc: Adrián Larumbe <adrian.larumbe@collabora.com>
Cc: steven.price@arm.com
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://lore.kernel.org/r/20230927133843.247957-2-tvrtko.ursulin@linux.intel.comSigned-off-by: Maxime Ripard <mripard@kernel.org>

5faf6e18

05 Nov, 2023 5 commits

drm/sched: Drop suffix from drm_sched_wakeup_if_can_queue · f12af4c4

Tvrtko Ursulin authored Nov 02, 2023

Because a) helper is exported to other parts of the scheduler and
b) there isn't a plain drm_sched_wakeup to begin with, I think we can
drop the suffix and by doing so separate the intimiate knowledge
between the scheduler components a bit better.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-6-tvrtko.ursulin@linux.intel.comReviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

f12af4c4

drm/sched: Rename drm_sched_run_job_queue_if_ready and clarify kerneldoc · 35a4279d

Tvrtko Ursulin authored Nov 02, 2023

"If ready" is not immediately clear what it means - is the scheduler
ready or something else? Drop the suffix, clarify kerneldoc, and employ
the same naming scheme as in drm_sched_run_free_queue:

 - drm_sched_run_job_queue   - enqueues if there is something to enqueue
                               *and* scheduler is ready (can queue)
 - __drm_sched_run_job_queue - low-level helper to simply queue the job
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-5-tvrtko.ursulin@linux.intel.comReviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

35a4279d

drm/sched: Rename drm_sched_free_job_queue to be more descriptive · 67dd1d8c

Tvrtko Ursulin authored Nov 02, 2023

The current name makes it sound like helper will free a queue, while what
it does is it enqueues the free job worker.

Rename it to drm_sched_run_free_queue to align with existing
drm_sched_run_job_queue.

Despite that creating an illusion there are two queues, while in reality
there is only one, at least it creates a consistent naming for the two
enqueuing helpers.

At the same time simplify the "if done" helper by dropping the suffix and
adding a double underscore prefix to the one which just enqueues.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-4-tvrtko.ursulin@linux.intel.comReviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

67dd1d8c

drm/sched: Move free worker re-queuing out of the if block · e608d9f7

Tvrtko Ursulin authored Nov 02, 2023

Whether or not there are more jobs to clean up does not depend on the
existance of the current job, given both drm_sched_get_finished_job and
drm_sched_free_job_queue_if_done take and drop the job list lock.
Therefore it is confusing to make it read like there is a dependency.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-3-tvrtko.ursulin@linux.intel.comReviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

e608d9f7

drm/sched: Rename drm_sched_get_cleanup_job to be more descriptive · 7abbbe26

Tvrtko Ursulin authored Nov 02, 2023

"Get cleanup job" makes it sound like helper is returning a job which will
execute some cleanup, or something, while the kerneldoc itself accurately
says "fetch the next _finished_ job". So lets rename the helper to be self
documenting.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Luben Tuikov <ltuikov89@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231102105538.391648-2-tvrtko.ursulin@linux.intel.comReviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

7abbbe26

03 Nov, 2023 2 commits

accel/qaic: Support for 0 resize slice execution in BO · 3b511278

Pranjal Ramajor Asha Kanojiya authored Oct 27, 2023

Add support to partially execute a slice which is resized to zero.
Executing a zero size slice in a BO should mean that there is no DMA
transfers involved but you should still configure doorbell and semaphores.

For example consider a BO of size 18K and it is sliced into 3 6K slices
and user calls partial execute ioctl with resize as 10K.
slice 0 - size is 6k and offset is 0, so resize of 10K will not cut short
this slice hence we send the entire slice for execution.
slice 1 - size is 6k and offset is 6k, so resize of 10K will cut short this
slice and only the first 4k should be DMA along with configuring
doorbell and semaphores.
slice 2 - size is 6k and offset is 12k, so resize of 10k will cut short
this slice and no DMA transfer would be involved but we should
would configure doorbell and semaphores.

This change begs to change the behavior of 0 resize. Currently, 0 resize
partial execute ioctl behaves exactly like execute ioctl i.e. no resize.
After this patch all the slice in BO should behave exactly like slice 2 in
above example.

Refactor copy_partial_exec_reqs() to make it more readable and less
complex.
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231027164330.11978-1-quic_jhugo@quicinc.com

3b511278

accel/qaic: Quiet array bounds check on DMA abort message · 44793c6a

Carl Vanderlip authored Oct 27, 2023

Current wrapper is right-sized to the message being transferred;
however, this is smaller than the structure defining message wrappers
since the trailing element is a union of message/transfer headers of
various sizes (8 and 32 bytes on 32-bit system where issue was
reported). Using the smaller header with a small message
(wire_trans_dma_xfer is 24 bytes including header) ends up being smaller
than a wrapper with the larger header. There are no accesses outside of
the defined size, however they are possible if the larger union member
is referenced.

Abort messages are outside of hot-path and changing the wrapper struct
would require a larger rewrite, so having the memory allocated to the
message be 8 bytes too big is acceptable.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202310182253.bcb9JcyJ-lkp@intel.com/Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com>
Reviewed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231027180810.4873-1-quic_jhugo@quicinc.com

44793c6a

02 Nov, 2023 6 commits

drm/v3d: add brcm,2712-v3d as a compatible V3D device · 6fd94871

Iago Toral Quiroga authored Oct 31, 2023

This is required to get the V3D module to load with Raspberry Pi 5.
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-5-itoral@igalia.com

6fd94871

dt-bindings: gpu: v3d: Add BCM2712's compatible · ebb2f6ee

Iago Toral Quiroga authored Oct 31, 2023

BCM2712, Raspberry Pi 5's SoC, contains a V3D core. So add its specific
compatible to the bindings.
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-4-itoral@igalia.com

ebb2f6ee

drm/v3d: fix up register addresses for V3D 7.x · 0ad5bc1c

Iago Toral Quiroga authored Oct 31, 2023

This patch updates a number of register addresses that have
been changed in Raspberry Pi 5 (V3D 7.1) and updates the
code to use the corresponding registers and addresses based
on the actual V3D version.
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-3-itoral@igalia.com

0ad5bc1c

drm/v3d: update UAPI to match user-space for V3D 7.x · 1118d10f

Iago Toral Quiroga authored Oct 31, 2023

V3D 7.x takes a new parameter to configure TFU jobs that needs
to be provided by user space.
Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Maíra Canal <mcanal@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231031073859.25298-2-itoral@igalia.com

1118d10f

fbdev/simplefb: Add support for generic power-domains · 92a511a5

Thierry Reding authored Nov 01, 2023

The simple-framebuffer device tree bindings document the power-domains
property, so make sure that simplefb supports it. This ensures that the
power domains remain enabled as long as simplefb is active.

v2: - remove unnecessary call to simplefb_detach_genpds() since that's
      already done automatically by devres
    - fix crash if power-domains property is missing in DT
Signed-off-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231101172017.3872242-3-thierry.reding@gmail.com

92a511a5

fbdev/simplefb: Support memory-region property · 8ddfc01a

Thierry Reding authored Nov 01, 2023

The simple-framebuffer bindings specify that the "memory-region"
property can be used as an alternative to the "reg" property to define
the framebuffer memory used by the display hardware. Implement support
for this in the simplefb driver.
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231101172017.3872242-2-thierry.reding@gmail.com

8ddfc01a

01 Nov, 2023 6 commits

drm/sched: Add a helper to queue TDR immediately · 3c6c7ca4

Matthew Brost authored Oct 30, 2023

Add a helper whereby a driver can invoke TDR immediately.

v2:
 - Drop timeout args, rename function, use mod delayed work (Luben)
v3:
 - s/XE/Xe (Luben)
 - present tense in commit message (Luben)
 - Adjust comment for drm_sched_tdr_queue_imm (Luben)
v4:
 - Adjust commit message (Luben)

Cc: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-6-matthew.brost@intel.comSigned-off-by: Luben Tuikov <ltuikov89@gmail.com>

3c6c7ca4

drm/sched: Add drm_sched_start_timeout_unlocked helper · 7a36dcfa

Matthew Brost authored Oct 30, 2023

Also add a lockdep assert to drm_sched_start_timeout.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-5-matthew.brost@intel.comSigned-off-by: Luben Tuikov <ltuikov89@gmail.com>

7a36dcfa

drm/sched: Split free_job into own work item · f7fe64ad

Matthew Brost authored Oct 30, 2023

Rather than call free_job and run_job in same work item have a dedicated
work item for each. This aligns with the design and intended use of work
queues.

v2:
   - Test for DMA_FENCE_FLAG_TIMESTAMP_BIT before setting
     timestamp in free_job() work item (Danilo)
v3:
  - Drop forward dec of drm_sched_select_entity (Boris)
  - Return in drm_sched_run_job_work if entity NULL (Boris)
v4:
  - Replace dequeue with peek and invert logic (Luben)
  - Wrap to 100 lines (Luben)
  - Update comments for *_queue / *_queue_if_ready functions (Luben)
v5:
  - Drop peek argument, blindly reinit idle (Luben)
  - s/drm_sched_free_job_queue_if_ready/drm_sched_free_job_queue_if_done (Luben)
  - Update work_run_job & work_free_job kernel doc (Luben)
v6:
  - Do not move drm_sched_select_entity in file (Luben)
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-4-matthew.brost@intel.comReviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>

f7fe64ad

drm/sched: Convert drm scheduler to use a work queue rather than kthread · a6149f03

Matthew Brost authored Oct 30, 2023

In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1
mapping between a drm_gpu_scheduler and drm_sched_entity. At first this
seems a bit odd but let us explain the reasoning below.

1. In Xe the submission order from multiple drm_sched_entity is not
guaranteed to be the same completion even if targeting the same hardware
engine. This is because in Xe we have a firmware scheduler, the GuC,
which allowed to reorder, timeslice, and preempt submissions. If a using
shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls
apart as the TDR expects submission order == completion order. Using a
dedicated drm_gpu_scheduler per drm_sched_entity solve this problem.

2. In Xe submissions are done via programming a ring buffer (circular
buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the
limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow
control on the ring for free.

A problem with this design is currently a drm_gpu_scheduler uses a
kthread for submission / job cleanup. This doesn't scale if a large
number of drm_gpu_scheduler are used. To work around the scaling issue,
use a worker rather than kthread for submission / job cleanup.

v2:
  - (Rob Clark) Fix msm build
  - Pass in run work queue
v3:
  - (Boris) don't have loop in worker
v4:
  - (Tvrtko) break out submit ready, stop, start helpers into own patch
v5:
  - (Boris) default to ordered work queue
v6:
  - (Luben / checkpatch) fix alignment in msm_ringbuffer.c
  - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue
  - (Luben) Update comment for drm_sched_wqueue_enqueue
  - (Luben) Positive check for submit_wq in drm_sched_init
  - (Luben) s/alloc_submit_wq/own_submit_wq
v7:
  - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue
v8:
  - (Luben) Adjust var names / comments
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.comSigned-off-by: Luben Tuikov <ltuikov89@gmail.com>

a6149f03

drm/sched: Add drm_sched_wqueue_* helpers · 35963cf2

Matthew Brost authored Oct 30, 2023

Add scheduler wqueue ready, stop, and start helpers to hide the
implementation details of the scheduler from the drivers.

v2:
  - s/sched_wqueue/sched_wqueue (Luben)
  - Remove the extra white line after the return-statement (Luben)
  - update drm_sched_wqueue_ready comment (Luben)

Cc: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Link: https://lore.kernel.org/r/20231031032439.1558703-2-matthew.brost@intel.comSigned-off-by: Luben Tuikov <ltuikov89@gmail.com>

35963cf2

dma-buf: add dma_fence_timestamp helper · 0da611a8

Christian König authored Sep 08, 2023

When a fence signals there is a very small race window where the timestamp
isn't updated yet. sync_file solves this by busy waiting for the
timestamp to appear, but on other ocassions didn't handled this
correctly.

Provide a dma_fence_timestamp() helper function for this and use it in
all appropriate cases.

Another alternative would be to grab the spinlock when that happens.

v2 by teddy: add a wait parameter to wait for the timestamp to show up, in case
   the accurate timestamp is needed and/or the timestamp is not based on
   ktime (e.g. hw timestamp)
v3 chk: drop the parameter again for unified handling
Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 1774baa6 ("drm/scheduler: Change scheduled fence track v2")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20230929104725.2358-1-christian.koenig@amd.com

0da611a8

31 Oct, 2023 6 commits

accel/ivpu: Rename VPU to NPU in product strings · 2fc1a50f

Jacek Lawrynowicz authored Oct 28, 2023

VPU was rebranded as NPU (Neural Processing Unit) so user facing
strings have to be updated but the code remains as is and the module
is still called intel_vpu.ko.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-9-stanislaw.gruszka@linux.intel.com

2fc1a50f

accel/ivpu: Simplify MMU SYNC command · 0c287c27

Jacek Lawrynowicz authored Oct 28, 2023

CMD_SYNC does not need any args as we poll for completion anyway.
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-8-stanislaw.gruszka@linux.intel.com

0c287c27

accel/ivpu: Make DMA allocations for MMU600 write combined · 3bcc5209

Karol Wachowski authored Oct 28, 2023

Previously using dma_alloc_wc() API we created cache coherent
(mapped as write-back) mappings.

Because we disable MMU600 snooping it was required to do costly
page walk and cache flushes after each page table modification.

With write-combined buffers it's possible to do a single write memory
barrier to flush write-combined buffer to memory which simplifies the
driver and significantly reduce time of map/unmap operations.

Mapping time of 255 MB is reduced from 2.5 ms to 500 us.
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-7-stanislaw.gruszka@linux.intel.com

3bcc5209

accel/ivpu: Print CMDQ errors after consumer timeout · e013aa9a

Karol Wachowski authored Oct 28, 2023

Add checking of error reason bits in IVPU_MMU_CMDQ_CONS
register when waiting for consumer timeout occurred.
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-6-stanislaw.gruszka@linux.intel.com

e013aa9a

accel/ivpu: Abort pending rx ipc on reset · ba6b035d

Stanislaw Gruszka authored Oct 28, 2023

Waking up process, which wait for particular condition, will go to
sleep again on wake_up() if the condition is not met. Add abort flag
to wake up IPC receivers, which will finish with -ECANCELED error.

This is only needed for reset, run time power management prevent to
suspend VPU when there is pending IPC processing or pending job.
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-5-stanislaw.gruszka@linux.intel.com

ba6b035d

accel/ivpu: Stop job_done_thread on suspend · 57c7e3e4

Stanislaw Gruszka authored Oct 28, 2023

Stop job_done thread when going to suspend. Use kthread_park() instead
of kthread_stop() to avoid memory allocation and potential failure
on resume.

Use separate function as thread wake up condition. Use spin lock to assure
rx_msg_list is properly protected against concurrent access. This avoid
race condition when the rx_msg_list list is modified and read in
ivpu_ipc_recive() at the same time.
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231028155936.1183342-4-stanislaw.gruszka@linux.intel.com

57c7e3e4