1. 03 Sep, 2020 10 commits
    • drm/amd/display: Fix a list corruption · 1545fbf9
      xinhui pan authored
      Remove the private obj from the internal list before we free aconnector.
      
      [   56.925828] BUG: unable to handle page fault for address: ffff8f84a870a560
      [   56.933272] #PF: supervisor read access in kernel mode
      [   56.938801] #PF: error_code(0x0000) - not-present page
      [   56.944376] PGD 18e605067 P4D 18e605067 PUD 86a614067 PMD 86a4d0067 PTE 800ffff8578f5060
      [   56.953260] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
      [   56.958815] CPU: 6 PID: 1407 Comm: bash Tainted: G           O      5.9.0-rc2+ #46
      [   56.967092] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019
      [   56.977162] RIP: 0010:__list_del_entry_valid+0x31/0xa0
      [   56.982768] Code: 00 ad de 55 48 8b 17 4c 8b 47 08 48 89 e5 48 39 c2 74 27 48 b8 22 01 00 00 00 00 ad de 49 39 c0 74 2d 49 8b 30 48 39 fe 75 3d <48> 8b 52 08 48 39 f2 75 4c b8 01 00 00 00 5d c3 48 89 7
      [   57.003327] RSP: 0018:ffffb40c81687c90 EFLAGS: 00010246
      [   57.009048] RAX: dead000000000122 RBX: ffff8f84ea41f4f0 RCX: 0000000000000006
      [   57.016871] RDX: ffff8f84a870a558 RSI: ffff8f84ea41f4f0 RDI: ffff8f84ea41f4f0
      [   57.024672] RBP: ffffb40c81687c90 R08: ffff8f84ea400998 R09: 0000000000000001
      [   57.032490] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000006
      [   57.040287] R13: ffff8f84ea422a90 R14: ffff8f84b4129a20 R15: fffffffffffffff2
      [   57.048105] FS:  00007f550d885740(0000) GS:ffff8f8509600000(0000) knlGS:0000000000000000
      [   57.056979] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   57.063260] CR2: ffff8f84a870a560 CR3: 00000007e5144001 CR4: 00000000003706e0
      [   57.071053] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   57.078849] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   57.086684] Call Trace:
      [   57.089381]  drm_atomic_private_obj_fini+0x29/0x82 [drm]
      [   57.095247]  amdgpu_dm_fini+0x83/0x170 [amdgpu]
      [   57.100264]  dm_hw_fini+0x23/0x30 [amdgpu]
      [   57.104814]  amdgpu_device_fini+0x1df/0x4fe [amdgpu]
      [   57.110271]  amdgpu_driver_unload_kms+0x43/0x70 [amdgpu]
      [   57.116136]  amdgpu_pci_remove+0x3b/0x60 [amdgpu]
      [   57.121291]  pci_device_remove+0x3e/0xb0
      [   57.125583]  device_release_driver_internal+0xff/0x1d0
      [   57.131223]  device_release_driver+0x12/0x20
      [   57.135903]  pci_stop_bus_device+0x70/0xa0
      [   57.140401]  pci_stop_and_remove_bus_device_locked+0x1b/0x30
      [   57.146571]  remove_store+0x7b/0x90
      [   57.150429]  dev_attr_store+0x17/0x30
      [   57.154441]  sysfs_kf_write+0x4b/0x60
      [   57.158479]  kernfs_fop_write+0xe8/0x1d0
      [   57.162788]  vfs_write+0xf5/0x230
      [   57.166426]  ksys_write+0x70/0xf0
      [   57.170087]  __x64_sys_write+0x1a/0x20
      [   57.174219]  do_syscall_64+0x38/0x90
      [   57.178145]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: xinhui pan <xinhui.pan@amd.com>
      Acked-by: Feifei Xu <Feifei Xu@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
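The ordering this commit describes (unlink the object from its list, then free the owner) can be sketched in userspace C. This is only an illustration, not the driver's code: `struct list_node`, `struct connector`, and every helper name below are hypothetical stand-ins for the kernel's `struct list_head` machinery and for aconnector.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
    struct list_node *prev, *next;
};

static void list_init(struct list_node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_del_node(struct list_node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = n;
}

static int list_count(const struct list_node *head)
{
    int count = 0;
    for (const struct list_node *p = head->next; p != head; p = p->next)
        count++;
    return count;
}

/* Hypothetical object that embeds its list linkage, like a private obj
 * embedded in a connector. */
struct connector {
    struct list_node link;
    int id;
};

static struct connector *connector_create(struct list_node *head, int id)
{
    struct connector *c = malloc(sizeof(*c));
    if (!c)
        return NULL;
    c->id = id;
    list_add_tail_node(&c->link, head);
    return c;
}

static void connector_destroy(struct connector *c)
{
    /* Unlink first: once the memory is freed, any later list walk or
     * list_del on a neighbour would touch a dangling node. */
    list_del_node(&c->link);
    free(c);
}
```

If `connector_destroy()` freed `c` without unlinking it, the neighbouring nodes would still point into freed memory, which is the kind of state that faults in `__list_del_entry_valid` in the trace above.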
    • drm/amdgpu: Fix a redundant kfree · 3d7248d7
      xinhui pan authored
      drm_dev_alloc() allocates *dev* and sets managed.final_kfree to dev so
      that it frees itself.
      Since commit 5cdd68498918 ("drm/amdgpu: Embed drm_device into
      amdgpu_device (v3)") we allocate *adev*, and ddev is just a member of it,
      so drm_dev_release() then tries to free the wrong pointer.
      
      The driver's release callback also tries to free adev, but
      drm_dev_release() still accesses dev after calling the driver's release.
      
      To fix this, remove the driver's release callback and set
      managed.final_kfree to adev.
      
      [   36.269348] BUG: unable to handle page fault for address: ffffa0c279940028
      [   36.276841] #PF: supervisor read access in kernel mode
      [   36.282434] #PF: error_code(0x0000) - not-present page
      [   36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 800ffff8066bf060
      [   36.296868] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
      [   36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G           O      5.9.0-rc2+ #46
      [   36.310670] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019
      [   36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm]
      [   36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 be 00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 8b 7f 18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5
      [   36.347217] RSP: 0018:ffffb9424141fce0 EFLAGS: 00010282
      [   36.352931] RAX: 0000000000000006 RBX: ffffa0c279940010 RCX: 0000000000000006
      [   36.360718] RDX: ffffffffc0419f5a RSI: 0000000000000200 RDI: ffffa0c279940010
      [   36.368503] RBP: ffffb9424141fd10 R08: 0000000000000001 R09: 0000000000000001
      [   36.376304] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0c279940010
      [   36.384070] R13: ffffffffc0e2a000 R14: ffffa0c26924e220 R15: fffffffffffffff2
      [   36.391845] FS:  00007fc4a277b740(0000) GS:ffffa0c288e00000(0000) knlGS:0000000000000000
      [   36.400669] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   36.406937] CR2: ffffa0c279940028 CR3: 0000000792304006 CR4: 00000000003706e0
      [   36.414732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   36.422550] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   36.430354] Call Trace:
      [   36.433044]  drm_dev_put.part.0+0x40/0x60 [drm]
      [   36.438017]  drm_dev_put+0x13/0x20 [drm]
      [   36.442398]  amdgpu_pci_remove+0x56/0x60 [amdgpu]
      [   36.447528]  pci_device_remove+0x3e/0xb0
      [   36.451807]  device_release_driver_internal+0xff/0x1d0
      [   36.457416]  device_release_driver+0x12/0x20
      [   36.462094]  pci_stop_bus_device+0x70/0xa0
      [   36.466588]  pci_stop_and_remove_bus_device_locked+0x1b/0x30
      [   36.472786]  remove_store+0x7b/0x90
      [   36.476614]  dev_attr_store+0x17/0x30
      [   36.480646]  sysfs_kf_write+0x4b/0x60
      [   36.484655]  kernfs_fop_write+0xe8/0x1d0
      [   36.488952]  vfs_write+0xf5/0x230
      [   36.492562]  ksys_write+0x70/0xf0
      [   36.496206]  __x64_sys_write+0x1a/0x20
      [   36.500292]  do_syscall_64+0x38/0x90
      [   36.504219]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
      Signed-off-by: xinhui pan <xinhui.pan@amd.com>
      Acked-by: Alex Deucher <alexancer.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
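The pointer mix-up described in this commit can be modelled in plain C. The sketch below is an assumption-laden illustration, not the DRM core: `struct base_dev`, `struct wrapper_dev`, `wrapper_alloc()`, and `base_put()` are hypothetical names, with `final_free` playing the role the message gives to managed.final_kfree.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical base object: records which allocation to free when the
 * last reference is dropped. */
struct base_dev {
    void *final_free;
    int refcount;
};

/* Hypothetical driver object that embeds the base object as a member,
 * deliberately not at offset 0. */
struct wrapper_dev {
    int driver_state;
    struct base_dev base;
};

static struct wrapper_dev *wrapper_alloc(void)
{
    struct wrapper_dev *w = calloc(1, sizeof(*w));
    if (!w)
        return NULL;
    w->base.refcount = 1;
    /* Record the start of the allocation (the wrapper), not &w->base:
     * freeing the embedded member's address would be invalid. */
    w->base.final_free = w;
    return w;
}

/* Drops one reference; returns 1 if the allocation was freed. */
static int base_put(struct base_dev *b)
{
    if (--b->refcount > 0)
        return 0;
    /* Read the pointer before freeing: b lives inside that allocation. */
    void *mem = b->final_free;
    free(mem);
    return 1;
}
```

There is a single owner of the final free here, so no separate driver-side release also frees the wrapper, and nothing touches the base object after the memory is gone.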
    • drm/amdgpu: block ring buffer access during GPU recovery · 81202807
      Dennis Li authored
      While the GPU is in reset, its state isn't stable, and the ring buffer
      also needs to be reset on resume. The driver should therefore keep other
      threads from accessing the ring buffer while the GPU recovery thread
      runs. Otherwise the GPU will randomly hang during recovery.
      
      v2: correct indent
      Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Dennis Li <Dennis.Li@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu/swsmu: handle manual fan readback on SMU11 · f6eb4339
      Alex Deucher authored
      Need to read back from registers for manual mode rather than
      using the metrics table.
      
      Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1164
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu/swsmu: add smu11 helper to get manual fan speed (v2) · 9a7fd013
      Alex Deucher authored
      Will be used to fetch the fan speeds when manual fan mode is
      set.
      
      v2: squash in a Coverity fix from Colin Ian King
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu/swsmu: drop set_fan_speed_percent (v2) · 8d6e65ad
      Alex Deucher authored
      No longer needed as we can calculate it based on
      the fan's max rpm.
      
      v2: minor code rework
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu/swsmu: drop get_fan_speed_percent (v2) · eff64742
      Alex Deucher authored
      No longer needed as we can calculate it based on
      the fan's max rpm.
      
      v2: rework code to avoid possible uninitialized
      variable use.
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
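The two "drop ..._fan_speed_percent" commits rely on the percentage being derivable from the fan's max rpm. A minimal sketch of that arithmetic follows, assuming a simple linear mapping clamped to 0..100; the function names and the clamping policy are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: express a fan speed in rpm as a percentage of
 * the maximum rpm. Returns 0 when max_rpm is unknown (0). */
static uint32_t fan_rpm_to_percent(uint32_t rpm, uint32_t max_rpm)
{
    if (max_rpm == 0)
        return 0;
    /* Widen before multiplying so rpm * 100 cannot overflow. */
    uint64_t pct = (uint64_t)rpm * 100 / max_rpm;
    return pct > 100 ? 100 : (uint32_t)pct;
}

/* Hypothetical inverse: turn a requested percentage into a target rpm. */
static uint32_t fan_percent_to_rpm(uint32_t percent, uint32_t max_rpm)
{
    if (percent > 100)
        percent = 100;
    return (uint32_t)((uint64_t)percent * max_rpm / 100);
}
```

With helpers of this shape, only the rpm read and write paths need per-ASIC code, plus one way to obtain the max rpm.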
    • drm/amdgpu/swsmu: add get_fan_parameters callbacks for smu11 asics · 3204ff3e
      Alex Deucher authored
      Grab the value from the pptable.
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu/swsmu: add new callback for getting fan parameters · 337b57ae
      Alex Deucher authored
      To fetch the max rpm from pptable.
      Reviewed-by: Evan Quan <evan.quan@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
    • drm/amdgpu: disable gpu-sched load balance for uvd · bc21585f
      Nirmoy Das authored
      On hardware with multiple uvd instances, dependent uvd jobs
      may get scheduled to different uvd instances. Because uvd_enc
      jobs retain hw context, dependent jobs should always run on the
      same uvd instance. This patch disables the GPU scheduler's load
      balancer for such a context, which binds all jobs from the same
      context to one uvd instance.
      
      v2: Squash in uvd_enc fix
      Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  2. 31 Aug, 2020 2 commits
  3. 28 Aug, 2020 1 commit
    • drm/amdgpu: fix compiler warnings · e230ac11
      Nirmoy Das authored
      Fixes the compiler warnings below:
       CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_device.o
      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
        381 | void static inline amdgpu_mm_wreg_mmio(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t acc_flags)
            | ^~~~
      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function ‘amdgpu_device_fini’:
      drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3381:6: warning: variable ‘r’ set but not used [-Wunused-but-set-variable]
       3381 |  int r;
            |      ^
      Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
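Both warning classes quoted in this commit have simple, generic remedies, sketched below in standalone C. This is not the actual patch: `reg_write()`, `flush_pending_work()`, and `device_teardown()` are hypothetical names, and discarding the return value with a `(void)` cast is just one possible way to avoid a set-but-unused local.

```c
#include <assert.h>
#include <stdint.h>

/* -Wold-style-declaration: put the storage-class specifier and
 * `inline` before the return type, i.e. `static inline void`,
 * not `void static inline`. */
static inline void reg_write(uint32_t *regs, uint32_t reg, uint32_t v)
{
    regs[reg] = v;
}

static unsigned int flush_calls;

/* Hypothetical teardown step whose status the caller does not need. */
static int flush_pending_work(void)
{
    flush_calls++;
    return 0;
}

static void device_teardown(void)
{
    /* -Wunused-but-set-variable: rather than storing the result in a
     * local `int r` that is never read, discard it explicitly. */
    (void)flush_pending_work();
}
```

Whether to drop the variable or start checking it depends on whether the ignored status can actually matter at that point in teardown.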
  4. 27 Aug, 2020 8 commits
  5. 26 Aug, 2020 19 commits