Commits · 16423d67936f87e320a7b11771675b982cc9de02 · nexedi / linux

15 Jul, 2014 2 commits

Update MAINTAINERS and CREDITS files with amdkfd info · 16423d67

Oded Gabbay authored Jul 15, 2014

v6: Update entries to reflect new name & location of driver
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

16423d67

drm/radeon: Add radeon <--> amdkfd interface · e28740ec

Oded Gabbay authored Jul 15, 2014

This patch adds the interface between the radeon driver and the amdkfd driver.
The interface implementation is contained in radeon_kfd.c and radeon_kfd.h.

The interface itself is represented by a pointer to struct
kfd_dev. The pointer is located inside radeon_device structure.

All the register accesses that amdkfd needs are done using this interface. This
allows us to avoid direct register accesses in amdkfd proper, while also
avoiding locking between amdkfd and radeon.

The single exception is the doorbells, which are used by both drivers.
However, because they are located in separate PCI BAR pages, the danger of
sharing registers between the drivers is minimal.

Having said that, we are planning to move the doorbells to radeon as well.

v3:

Add interface for sa manager init and fini. The init function will allocate a
buffer on system memory and pin it to the GART address space via the radeon sa
manager.

All mappings of buffers to GART address space are done via the radeon sa
manager. The memory allocation interface uses the radeon sa manager to
sub-allocate from the single buffer that was allocated during the init function.

Change lower_32/upper_32 calls to use linux macros

Add documentation for the interface

v4:

Change ptr field type in kgd_mem from uint32_t* to void* to match the type that
is returned by radeon_sa_bo_cpu_addr

v5:

Change format of mqd structure to work with latest KV firmware
Add support for AQL queue creation to enable working with the open-source HSA
runtime.
Move generic kfd-->kgd interface and other generic kgd definitions to a generic
header file that will be used by AMD's radeon and amdgpu drivers
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

e28740ec
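
The v3 notes above describe sub-allocating from one buffer that is pinned once at init time. A minimal userspace sketch of that idea follows. It is a simple bump allocator, not the radeon sa manager itself; `struct sa_pool`, `sa_pool_alloc`, `LOWER_32` and `UPPER_32` are hypothetical names chosen for illustration (the two macros mimic the kernel's `lower_32_bits()`/`upper_32_bits()` helpers).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the single buffer pinned at init time. */
struct sa_pool {
    uint64_t gpu_base;  /* assumed GART address of the buffer */
    size_t   size;      /* total bytes in the buffer */
    size_t   used;      /* bytes handed out so far */
};

/* Userspace equivalents of the kernel's lower_32_bits()/upper_32_bits(). */
#define LOWER_32(n) ((uint32_t)((n) & 0xffffffffu))
#define UPPER_32(n) ((uint32_t)(((uint64_t)(n)) >> 32))

/*
 * Carve `bytes` out of the pool, starting at the next offset aligned to
 * `align` (which must be a power of two).
 * Returns 0 and stores the GPU address on success, -1 if the pool is full.
 */
static int sa_pool_alloc(struct sa_pool *p, size_t bytes, size_t align,
                         uint64_t *gpu_addr)
{
    size_t start = (p->used + align - 1) & ~(align - 1);

    if (start > p->size || bytes > p->size - start)
        return -1;
    *gpu_addr = p->gpu_base + start;
    p->used = start + bytes;
    return 0;
}
```

A caller would typically split the returned 64-bit address with `LOWER_32()` and `UPPER_32()` before writing it into a pair of 32-bit hardware fields.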

14 Jul, 2014 1 commit

drm/radeon: adding synchronization for GRBM GFX · 1c0a4625

Oded Gabbay authored Jul 14, 2014

Implementing a lock for selecting and accessing shader engines and arrays.
This lock will make sure that radeon and amdkfd are not colliding when
accessing shader engines and arrays with GRBM_GFX_INDEX register.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

1c0a4625
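
The pattern this commit describes is "select a shader engine/array through an index register, then access the selected bank", with both steps under one lock. The sketch below models it in plain C11. The register, the bank layout, the bit encoding and all names are assumptions for illustration, and a spinlock built on `atomic_flag` stands in for whatever lock type the driver actually uses.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Simulated hardware: one index register selects which bank a read hits. */
static uint32_t grbm_gfx_index;        /* stand-in for GRBM_GFX_INDEX */
static uint32_t bank_regs[4][2];       /* [shader engine][shader array] */

static atomic_flag grbm_idx_lock = ATOMIC_FLAG_INIT;

static void grbm_lock(void)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&grbm_idx_lock,
                                             memory_order_acquire))
        ;
}

static void grbm_unlock(void)
{
    atomic_flag_clear_explicit(&grbm_idx_lock, memory_order_release);
}

/* Select (se, sh) and read the banked register as one critical section. */
static uint32_t read_banked(unsigned se, unsigned sh)
{
    uint32_t v;

    grbm_lock();
    grbm_gfx_index = (se << 16) | (sh << 8);   /* encoding is assumed */
    v = bank_regs[(grbm_gfx_index >> 16) & 3][(grbm_gfx_index >> 8) & 1];
    grbm_unlock();
    return v;
}
```

Without the lock, a second driver could rewrite the index between the select and the read, so the first caller would read the wrong bank.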

28 Jan, 2014 1 commit

drm/radeon: Report doorbell configuration to amdkfd · ebff8453

Oded Gabbay authored Jan 28, 2014

radeon and amdkfd share the doorbell aperture.
radeon sets it up, takes the doorbells required for its own rings
and reports the setup to amdkfd.
radeon reserved doorbells are at the start of the doorbell aperture.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

ebff8453
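
As a rough illustration of the split described above, the sketch below keeps the first doorbells for the graphics driver and reports the remainder. `struct doorbell_report` and `report_doorbells` are hypothetical names; the real report passed to amdkfd may carry different fields.

```c
#include <stddef.h>

/* Hypothetical report handed from the graphics driver to the compute driver. */
struct doorbell_report {
    size_t first;   /* index of the first doorbell left for the compute driver */
    size_t count;   /* number of doorbells left for the compute driver */
};

/*
 * The graphics driver keeps the first `reserved` doorbells of an aperture
 * holding `total` doorbells and reports everything after them.
 */
static struct doorbell_report report_doorbells(size_t total, size_t reserved)
{
    struct doorbell_report r;

    if (reserved > total)
        reserved = total;   /* nothing left to hand over */
    r.first = reserved;
    r.count = total - reserved;
    return r;
}
```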

11 Feb, 2014 1 commit

drm/radeon/cik: Don't touch int of pipes 1-7 · 28b57b85

Oded Gabbay authored Feb 11, 2014

amdkfd should set interrupts for pipes 1-7.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

28b57b85

16 Jan, 2014 1 commit

drm/radeon: reduce number of free VMIDs and pipes in KV · 62a7b7fb

Oded Gabbay authored Jan 16, 2014

To support HSA on KV, we need to limit the number of vmids and pipes
that are available for radeon's use with KV.

This patch reserves VMIDs 8-15 for amdkfd (so radeon can only use VMIDs
0-7) and also makes radeon think that KV has only a single MEC with a single
pipe in it.

v3: Use define for static vmid allocation in radeon
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

62a7b7fb
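
The static split described above can be written down as a pair of range checks. This is only a sketch: the macro and function names are made up for illustration, and only the two ranges (0-7 and 8-15) come from the commit message.

```c
#include <stdbool.h>

/* Assumed names; the commit message only gives the ranges. */
#define GFX_NUM_VMIDS   8   /* VMIDs 0-7 stay with radeon */
#define KFD_FIRST_VMID  8
#define KFD_NUM_VMIDS   8   /* VMIDs 8-15 are reserved for amdkfd */

static bool vmid_belongs_to_gfx(unsigned vmid)
{
    return vmid < GFX_NUM_VMIDS;
}

static bool vmid_belongs_to_kfd(unsigned vmid)
{
    return vmid >= KFD_FIRST_VMID && vmid < KFD_FIRST_VMID + KFD_NUM_VMIDS;
}
```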

10 Nov, 2014 1 commit

iommu/amd: fix accounting of device_state · a015c1e9

Oded Gabbay authored Nov 10, 2014

This patch fixes a bug in the accounting of the device_state.
In the current code, the device_state was put (decremented) too many times,
which sometimes led to the driver getting stuck permanently in
put_device_state_wait(). That happened because the device_state->count would go
below zero, which is never supposed to happen.

The root cause is that the device_state was decremented in put_pasid_state()
and put_pasid_state_wait() but also in all the functions that call those
functions. Therefore, the device_state was decremented twice in each of these
code paths.

The fix is to decouple the device_state accounting from the pasid_state
accounting: remove the call to put_device_state() from
put_pasid_state() and put_pasid_state_wait().
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

a015c1e9
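
The shape of the fix can be sketched with two plain counters. This is a simplified model, not the driver code: the real functions use atomic counters and wait queues, and `release_path` is a hypothetical caller standing in for "all the functions that call those functions".

```c
/* Simplified model: plain ints instead of the driver's atomic counters. */
struct device_state {
    int count;
};

struct pasid_state {
    int count;
    struct device_state *dev;
};

static void put_device_state(struct device_state *d)
{
    d->count--;
}

/*
 * Fixed shape: dropping a pasid reference no longer touches the
 * device_state count, so each reference is released exactly once.
 */
static void put_pasid_state(struct pasid_state *p)
{
    p->count--;
}

/* Hypothetical caller that holds one reference on each object. */
static void release_path(struct pasid_state *p)
{
    put_pasid_state(p);
    put_device_state(p->dev);   /* the only device_state put on this path */
}
```

Before the fix, the equivalent of `put_pasid_state()` also decremented the device count, so a path like `release_path()` dropped it twice and the count could go negative.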

13 Nov, 2014 4 commits

iommu/amd: use new invalidate_range mmu-notifier · e7cc3dd4

Joerg Roedel authored Nov 13, 2014

Make use of the new invalidate_range mmu_notifier call-back and remove the
old logic of assigning an empty page-table between invalidate_range_start
and invalidate_range_end.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Oded Gabbay <oded.gabbay@amd.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Jay Cornwall <Jay.Cornwall@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

e7cc3dd4

mmu_notifier: add the callback for mmu_notifier_invalidate_range() · 0f0a327f

Joerg Roedel authored Nov 13, 2014

Now that the mmu_notifier_invalidate_range() calls are in place, add the
callback to allow subsystems to register against it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Jay Cornwall <Jay.Cornwall@amd.com>
Cc: Oded Gabbay <Oded.Gabbay@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

0f0a327f

mmu_notifier: call mmu_notifier_invalidate_range() from VMM · 34ee645e

Joerg Roedel authored Nov 13, 2014

Add calls to the new mmu_notifier_invalidate_range() function to all
places in the VMM that need it.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Jay Cornwall <Jay.Cornwall@amd.com>
Cc: Oded Gabbay <Oded.Gabbay@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

34ee645e

mmu_notifier: add mmu_notifier_invalidate_range() · 1897bdc4

Joerg Roedel authored Nov 13, 2014

This notifier closes an important gap in the current mmu_notifier
implementation: the existing callbacks are called too early or too late to
reliably manage a non-CPU TLB.  Specifically, invalidate_range_start() is
called when all pages are still mapped and invalidate_range_end() when all
pages are unmapped and potentially freed.

This is fine when the users of the mmu_notifiers manage their own SoftTLB,
like KVM does.  When the TLB is managed in software it is easy to wipe out
entries for a given range and prevent new entries from being established until
invalidate_range_end is called.

But when the user of mmu_notifiers has to manage a hardware TLB it can
still wipe out TLB entries in invalidate_range_start, but it can't make
sure that no new TLB entries in the given range are established between
invalidate_range_start and invalidate_range_end.

To avoid silent data corruption the entries in the non-CPU TLB need to be
flushed when the pages are unmapped (at this point in time no _new_ TLB
entries can be established in the non-CPU TLB) but not yet freed (as the
non-CPU TLB may still have _existing_ entries pointing to the pages about
to be freed).

To fix this problem we need to catch the moment when the Linux VMM flushes
remote TLBs (as a non-CPU TLB is a remote TLB from the Linux VMM's
perspective), as this is the point in time when the pages are unmapped but
_not_ yet freed.

The mmu_notifier_invalidate_range() function aims to catch that moment.

IOMMU code will be one user of the notifier-callback.  Currently this is
only the AMD IOMMUv2 driver, but its code is about to be more generalized
and converted to a generic IOMMU-API extension to fit the needs of similar
functionality in other IOMMUs as well.

The current attempt in the AMD IOMMUv2 driver to work around the
invalidate_range_start/end() shortcoming is to assign an empty page table
to the non-CPU TLB between any invalidate_range_start/end calls.  With the
empty page-table assigned, every page-table walk to re-fill the non-CPU
TLB will cause a page-fault reported to the IOMMU driver via an interrupt,
possibly causing interrupt storms.

The page-fault handler in the AMD IOMMUv2 driver doesn't handle the fault
if an invalidate_range_start/end pair is active; it just reports back
SUCCESS to the device and lets it refault the page.  But existing hardware
(newer Radeon GPUs) that makes use of this feature doesn't re-fault
indefinitely; after a certain number of faults for the same address the
device enters a failure state and needs to be reset.

To avoid the GPUs entering a failure state we need to get rid of the
empty-page-table workaround and use the mmu_notifier_invalidate_range()
function introduced with this patch.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Jay Cornwall <Jay.Cornwall@amd.com>
Cc: Oded Gabbay <Oded.Gabbay@amd.com>
Cc: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>

1897bdc4
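
The ordering argument in this commit message can be modelled with a small trace. The sketch below is a toy, not the kernel API: the callbacks take only an opaque context (the real mmu_notifier callbacks carry more arguments), and `struct trace`, `record` and `unmap_range` are hypothetical helpers that just log the order of events.

```c
#include <stddef.h>

enum step {
    RANGE_START,        /* pages still mapped */
    PAGES_UNMAPPED,     /* no new device TLB entries can be created */
    INVALIDATE_RANGE,   /* flush existing device TLB entries here */
    PAGES_FREED,
    RANGE_END
};

/* Toy callback table; signatures are simplified for illustration. */
struct notifier_ops {
    void (*invalidate_range_start)(void *ctx);
    void (*invalidate_range)(void *ctx);
    void (*invalidate_range_end)(void *ctx);
};

struct trace {
    enum step steps[8];
    size_t n;
};

static void record(struct trace *t, enum step s)
{
    if (t->n < 8)
        t->steps[t->n++] = s;
}

static void on_start(void *ctx) { record(ctx, RANGE_START); }
static void on_range(void *ctx) { record(ctx, INVALIDATE_RANGE); }
static void on_end(void *ctx)   { record(ctx, RANGE_END); }

static const struct notifier_ops demo_ops = { on_start, on_range, on_end };

/* Models one unmap: the new callback fires after unmap but before free. */
static void unmap_range(const struct notifier_ops *ops, struct trace *t)
{
    ops->invalidate_range_start(t);
    record(t, PAGES_UNMAPPED);
    ops->invalidate_range(t);
    record(t, PAGES_FREED);
    ops->invalidate_range_end(t);
}
```

Calling `unmap_range(&demo_ops, &t)` on a zeroed `struct trace` records the five steps in the order listed in the enum, which is the window the message describes: unmapped but not yet freed.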

12 Nov, 2014 29 commits

Merge branch 'drm-next-3.19' of git://people.freedesktop.org/~agd5f/linux into drm-next · 7fd36c0b

Dave Airlie authored Nov 13, 2014

Radeon patches for 3.19.  Christian has a number of GPUVM improvements
slated as well, but I'd like to wait until he gets back to work next week
to pull those in. Highlights of this pull:
- ttm performance improvements
- CI dpm fixes

* 'drm-next-3.19' of git://people.freedesktop.org/~agd5f/linux: (26 commits)
  drm/radeon/si/ci: make u8 static arrays constant
  drm/radeon: set power control in ci dpm enable
  drm/radeon: powertune fixes for hawaii
  drm/radeon: fix dpm mc init for certain hawaii boards
  drm/radeon: set bootup pcie level to max for ci dpm
  drm/radeon: fix default dpm state setup
  drm/radeon: workaround a hw bug in bonaire pcie dpm
  drm/radeon: fix mclk vddc configuration for cards for hawaii
  drm/radeon: fix sclk DS enablement
  drm/radeon: fix activity settings for sclk and mclk for CI
  drm/radeon: improve mclk param calcuations for ci dpm
  drm/radeon: fix dram timing for certain hawaii boards
  drm/radeon: switch force state commands for CI
  drm/radeon: fix for memory training on bonaire 0x6649
  drm/radeon/ci: handle gpio controlled dpm features properly
  drm/radeon: store the gpio shift as well
  drm/radeon: export radeon_atombios_lookup_gpio
  drm/radeon: fix typo in CI dpm disable
  drm/radeon: rework CI dpm thermal setup
  drm/radeon: rework SI dpm thermal setup
  ...

7fd36c0b

drm/radeon/si/ci: make u8 static arrays constant · c81b9942

Dave Airlie authored Nov 10, 2014

These two arrays don't change, so just make them constant. This
reduces the data segment by a few bytes.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c81b9942

drm/radeon: set power control in ci dpm enable · b94b95e7

Alex Deucher authored Nov 07, 2014

Necessary for proper operation.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b94b95e7

drm/radeon: powertune fixes for hawaii · 542b379b

Alex Deucher authored Nov 07, 2014

- bapm is not available on hawaii
- update pt defaults
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

542b379b

drm/radeon: fix dpm mc init for certain hawaii boards · 90b2fee3

Alex Deucher authored Nov 07, 2014

Needs special overrides for certain vram configurations.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

90b2fee3

drm/radeon: set bootup pcie level to max for ci dpm · 4e21518c

Alex Deucher authored Nov 07, 2014

Avoids problems when re-loading the driver.  Does not
affect power saving when dpm is enabled.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

4e21518c

drm/radeon: fix default dpm state setup · b6b41cf3

Alex Deucher authored Nov 07, 2014

Only enable the first levels for mclk and sclk.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

b6b41cf3

drm/radeon: workaround a hw bug in bonaire pcie dpm · 36654dd4

Alex Deucher authored Nov 07, 2014

Some boards get stuck in pcie x1 otherwise.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

36654dd4

drm/radeon: fix mclk vddc configuration for cards for hawaii · 127e056e

Alex Deucher authored Nov 07, 2014

Need to use vddc0 for vddc1 for certain hawaii configurations.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

127e056e

drm/radeon: fix sclk DS enablement · 489ba72c

Alex Deucher authored Nov 07, 2014

Only enable it for levels 0 and 1.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

489ba72c

drm/radeon: fix activity settings for sclk and mclk for CI · d3052b8c

Alex Deucher authored Nov 07, 2014

Only need to be enabled on the first level.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

d3052b8c

drm/radeon: improve mclk param calcuations for ci dpm · c0392f8f

Alex Deucher authored Nov 07, 2014

Properly take into account the post divider.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c0392f8f

drm/radeon: fix dram timing for certain hawaii boards · 21b8a369

Alex Deucher authored Nov 07, 2014

Certain memory configurations need a fix.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

21b8a369

drm/radeon: switch force state commands for CI · 1c52279f

Alex Deucher authored Nov 07, 2014

Use the preferred SMC commands for forcing state on CI.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1c52279f

drm/radeon: fix for memory training on bonaire 0x6649 · 9feb3dda

Alex Deucher authored Nov 07, 2014

Workaround for memory link training on certain variants
of 0x6649.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9feb3dda

drm/radeon/ci: handle gpio controlled dpm features properly · 34fc0b58

Alex Deucher authored Nov 07, 2014

Certain feature enablement depends on entries in the atom
gpio pin table.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

34fc0b58

drm/radeon: store the gpio shift as well · 727b3d25

Alex Deucher authored Nov 07, 2014

We need this in the dpm code.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

727b3d25

drm/radeon: export radeon_atombios_lookup_gpio · 09e619c0

Alex Deucher authored Nov 07, 2014

We need it for dpm.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

09e619c0

drm/radeon: fix typo in CI dpm disable · 129acb7c

Alex Deucher authored Nov 07, 2014

Need to disable DS, not enable it when disabling dpm.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org

129acb7c

drm/radeon: rework CI dpm thermal setup · 1955f107

Alex Deucher authored Sep 14, 2014

In preparation for fan control.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

1955f107

drm/radeon: rework SI dpm thermal setup · 2271e2e2

Alex Deucher authored Sep 08, 2014

In preparation for fan control.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2271e2e2

drm/radeon/dpm: grab fan info from vbios · 9b92d1ec

Alex Deucher authored Sep 08, 2014

Required for fan control support.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

9b92d1ec

drm/ttm: Use only DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN · 507d0ca7

Michel Dänzer authored Oct 28, 2014

DRM_MM_SEARCH_BEST gets the smallest hole which can fit the BO. That seems to go
against the idea of TTM_PL_FLAG_TOPDOWN:

* The smallest hole may be in the overall bottom of the area
* If the hole isn't much larger than the BO, it doesn't make much
  difference whether the BO is placed at the bottom or at the top of the
  hole
Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

507d0ca7
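
To make the difference concrete, the sketch below compares a best-fit search with a top-down search over a list of free holes. It is a toy model, not the drm_mm allocator: `struct hole`, `place_best_fit` and `place_top_down` are hypothetical, and placing a best-fit allocation at the bottom of its hole is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

struct hole {
    uint64_t start;
    uint64_t size;
};

/* Best fit: pick the smallest hole that fits, place at its bottom. */
static int place_best_fit(const struct hole *h, size_t n, uint64_t size,
                          uint64_t *addr)
{
    const struct hole *best = NULL;

    for (size_t i = 0; i < n; i++)
        if (h[i].size >= size && (!best || h[i].size < best->size))
            best = &h[i];
    if (!best)
        return -1;
    *addr = best->start;
    return 0;
}

/* Top-down: pick the highest hole that fits, place at its top. */
static int place_top_down(const struct hole *h, size_t n, uint64_t size,
                          uint64_t *addr)
{
    const struct hole *top = NULL;

    for (size_t i = 0; i < n; i++)
        if (h[i].size >= size && (!top || h[i].start > top->start))
            top = &h[i];
    if (!top)
        return -1;
    *addr = top->start + top->size - size;
    return 0;
}
```

With holes at 0 (16 units), 100 (64 units) and 1000 (200 units), a 16-unit request lands at 0 under best fit, at the very bottom of the area, which is the first bullet's objection. The top-down search places it at 1184.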

drm/ttm: Add DRM_MM_SEARCH_BELOW for TTM_PL_FLAG_TOPDOWN · c165812c

Michel Dänzer authored Oct 28, 2014

If the BO should be placed at the top of the area, we should start looking
for holes from the top.
Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c165812c

drm/radeon: Set TTM_PL_FLAG_TOPDOWN also for RADEON_GEM_CPU_ACCESS BOs · a8b5ebe6

Michel Dänzer authored Oct 28, 2014

I wasn't sure if TTM_PL_FLAG_TOPDOWN works correctly with non-0 lpfn, but
AFAICT it does.
Reviewed-by: Lauri Kasanen <cand@gmx.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

a8b5ebe6

drm/radeon: Try evicting from CPU accessible to inaccessible VRAM first · 2a85aedd

Michel Dänzer authored Oct 09, 2014

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

2a85aedd

drm/radeon: Try placing NO_CPU_ACCESS BOs outside of CPU accessible VRAM · c9da4a4b

Michel Dänzer authored Oct 10, 2014

This avoids them getting in the way of BOs which might be accessed by
the CPU. They can still go to the CPU accessible part of VRAM though if
there's no space outside of it.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

c9da4a4b
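
The "preferred range first, whole VRAM as fallback" policy described above can be sketched as an ordered list of candidate ranges. The names below (`struct page_range`, `build_vram_ranges`) are hypothetical and the page-based units are an assumption; this is not the TTM placement structure.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open page range [first, end). */
struct page_range {
    uint64_t first;
    uint64_t end;
};

/*
 * Fill `out` with the ranges to try, in order, and return how many were
 * written (1 or 2). A buffer that never needs CPU access first tries the
 * part of VRAM beyond the CPU-visible window, then falls back to all of VRAM.
 */
static size_t build_vram_ranges(int no_cpu_access, uint64_t visible_pages,
                                uint64_t total_pages, struct page_range out[2])
{
    size_t n = 0;

    if (no_cpu_access && visible_pages < total_pages) {
        out[n].first = visible_pages;
        out[n].end = total_pages;
        n++;
    }
    out[n].first = 0;
    out[n].end = total_pages;
    n++;
    return n;
}
```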

drm: More specific locking for get* ioctls · fcf93f69

Daniel Vetter authored Nov 12, 2014

Motivated by the per-plane locking I've gone through all the get*
ioctls and reduced the locking to the bare minimum required.

v2: Rebase and make it compile ...

v3: Review from Sean:
- Simplify return handling in getplane_res.
- Add a comment to getplane_res that the plane list is invariant and
  can be walked locklessly.

v4: Actually git add.

Cc: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>

fcf93f69

drm: Per-plane locking · 4d02e2de

Daniel Vetter authored Nov 11, 2014

Turned out to be much simpler on top of my latest atomic stuff than
I had feared. Some details:

- Drop the modeset_lock_all snakeoil in drm_plane_init. Same
  justification as for the equivalent change in drm_crtc_init done in

	commit d0fa1af4
	Author: Daniel Vetter <daniel.vetter@ffwll.ch>
	Date:   Mon Sep 8 09:02:49 2014 +0200

	    drm: Drop modeset locking from crtc init function

  Without these the drm_modeset_lock_init would fall over the exact
  same way.

- Since the atomic core code wraps the locking, switching it to
  per-plane locks was a one-line change.

- For the legacy ioctls add a plane argument to the locking helper so
  that we can grab the right plane lock (cursor or primary). Since the
  universal cursor plane might not be there, or someone really crazy
  might forgo the primary plane, even accept NULL.

- Add some locking WARN_ON to the atomic helpers for good paranoid
  measure and to check that it all works out.

Tested on my exynos atomic hackfest with full lockdep checks and ww
backoff injection.

v2: I've forgotten about the load-detect code in i915.

v3: Thierry reported that in latest 3.18-rc vmwgfx doesn't compile any
more due to

commit 21e88620
Author: Rob Clark <robdclark@gmail.com>
Date:   Thu Oct 30 13:39:04 2014 -0400

    drm/vmwgfx: fix lock breakage

Rebased and fix this up.

Cc: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

4d02e2de