1. 11 Dec, 2021 2 commits
  2. 09 Dec, 2021 6 commits
  3. 08 Dec, 2021 5 commits
  4. 07 Dec, 2021 2 commits
  5. 03 Dec, 2021 4 commits
  6. 01 Dec, 2021 7 commits
  7. 30 Nov, 2021 1 commit
  8. 26 Nov, 2021 3 commits
    • drm/i915/gemfs: don't mark huge_opt as static · 3ccadbce
      Matthew Auld authored
      vfs_kernel_mount() modifies the passed in mount options, leaving us with
      "huge", instead of "huge=within_size". Normally this shouldn't matter
      with the usual module load/unload flow, however with the core_hotunplug
      IGT we are hitting the following, when re-probing the memory regions:
      
      i915 0000:00:02.0: [drm] Transparent Hugepage mode 'huge'
      tmpfs: Bad value for 'huge'
      [drm] Unable to create a private tmpfs mount, hugepage support will be disabled(-22).
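
The failure mode above can be reproduced in user space. The following is a minimal sketch, not the i915 code: `parse_in_place()` is a hypothetical stand-in for a parser that splits "key=value" by overwriting the '=' (the commit only states that vfs_kernel_mount() modifies the options it is given), and both `probe_*` helpers are illustrative names.

```c
#include <string.h>

/*
 * Hypothetical stand-in for an option parser that splits "key=value"
 * in place by writing a NUL over the '='. Returns the value, or NULL
 * when the string no longer contains an '='.
 */
static const char *parse_in_place(char *opts)
{
	char *eq = strchr(opts, '=');

	if (!eq)
		return NULL;
	*eq = '\0';
	return eq + 1;
}

/* Problem pattern: the static buffer keeps the damage from the first call. */
static const char *probe_with_static_opts(void)
{
	static char huge_opt[] = "huge=within_size";

	return parse_in_place(huge_opt);
}

/* Fixed pattern: an automatic array is re-initialised on every call. */
static int probe_with_local_opts(void)
{
	char huge_opt[] = "huge=within_size";
	const char *val = parse_in_place(huge_opt);

	return val && strcmp(val, "within_size") == 0;
}
```

The first call to `probe_with_static_opts()` sees "within_size"; a second call finds only "huge" left in the buffer, which mirrors the re-probe case. `probe_with_local_opts()` behaves the same on every call.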
      
      References: https://gitlab.freedesktop.org/drm/intel/-/issues/4651
      Signed-off-by: Matthew Auld <matthew.auld@intel.com>
      Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211126110843.2028582-1-matthew.auld@intel.com
    • drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code · 8b91cdd4
      Thomas Hellström authored
      The capture code is typically run entirely in the fence signalling
      critical path. We're about to add lockdep annotation in an upcoming patch
      which reveals a lockdep splat similar to the below one.
      
      Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM
      (which is the same as GFP_NOWAIT, but open-coded for clarity) rather than
      GFP_KERNEL for memory allocation in the capture path. This has the
      potential drawback that capture might fail in situations with memory
      pressure.
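
The difference between the two allocation modes can be sketched as a user-space model. The bit values and the `MODEL_` names below are illustrative assumptions, not the kernel's definitions; only the composition (GFP_KERNEL includes direct reclaim, __GFP_KSWAPD_RECLAIM alone does not) is meant to reflect include/linux/gfp.h.

```c
#include <stdbool.h>

/* Illustrative bit values only; the real ones live in include/linux/gfp.h. */
#define MODEL_GFP_IO              0x01u
#define MODEL_GFP_FS              0x02u
#define MODEL_GFP_DIRECT_RECLAIM  0x04u
#define MODEL_GFP_KSWAPD_RECLAIM  0x08u

/* GFP_KERNEL allows both kinds of reclaim plus I/O and filesystem calls. */
#define MODEL_GFP_RECLAIM \
	(MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_KSWAPD_RECLAIM)
#define MODEL_GFP_KERNEL \
	(MODEL_GFP_RECLAIM | MODEL_GFP_IO | MODEL_GFP_FS)

/* Direct reclaim is the part that can block on locks such as fs_reclaim. */
static bool may_block_in_reclaim(unsigned int gfp)
{
	return (gfp & MODEL_GFP_DIRECT_RECLAIM) != 0;
}

/* Waking kswapd only asks a background thread to free memory. */
static bool may_wake_kswapd(unsigned int gfp)
{
	return (gfp & MODEL_GFP_KSWAPD_RECLAIM) != 0;
}
```

In this model an allocation with only the kswapd bit set never waits for reclaim itself, which is why it is safe in the signalling path but more likely to fail under memory pressure.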
      
      [  234.842048] WARNING: possible circular locking dependency detected
      [  234.842050] 5.15.0-rc7+ #20 Tainted: G     U  W
      [  234.842052] ------------------------------------------------------
      [  234.842054] gem_exec_captur/1180 is trying to acquire lock:
      [  234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330
      [  234.842063]
                     but task is already holding lock:
      [  234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
      [  234.842138]
                     which lock already depends on the new lock.
      
      [  234.842140]
                     the existing dependency chain (in reverse order) is:
      [  234.842142]
                     -> #2 (dma_fence_map){++++}-{0:0}:
      [  234.842145]        __dma_fence_might_wait+0x41/0xa0
      [  234.842149]        dma_resv_lockdep+0x1dc/0x28f
      [  234.842151]        do_one_initcall+0x58/0x2d0
      [  234.842154]        kernel_init_freeable+0x273/0x2bf
      [  234.842157]        kernel_init+0x16/0x120
      [  234.842160]        ret_from_fork+0x1f/0x30
      [  234.842163]
                     -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
      [  234.842166]        fs_reclaim_acquire+0x6d/0xd0
      [  234.842168]        __kmalloc_node+0x51/0x3a0
      [  234.842171]        alloc_cpumask_var_node+0x1b/0x30
      [  234.842174]        native_smp_prepare_cpus+0xc7/0x292
      [  234.842177]        kernel_init_freeable+0x160/0x2bf
      [  234.842179]        kernel_init+0x16/0x120
      [  234.842181]        ret_from_fork+0x1f/0x30
      [  234.842184]
                     -> #0 (fs_reclaim){+.+.}-{0:0}:
      [  234.842186]        __lock_acquire+0x1161/0x1dc0
      [  234.842189]        lock_acquire+0xb5/0x2b0
      [  234.842192]        fs_reclaim_acquire+0xa1/0xd0
      [  234.842193]        __kmalloc+0x4d/0x330
      [  234.842196]        i915_vma_coredump_create+0x78/0x5b0 [i915]
      [  234.842253]        intel_engine_coredump_add_vma+0x36/0xe0 [i915]
      [  234.842307]        __i915_gpu_coredump+0x290/0x5e0 [i915]
      [  234.842365]        i915_capture_error_state+0x57/0xa0 [i915]
      [  234.842415]        intel_gt_handle_error+0x348/0x3e0 [i915]
      [  234.842462]        intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
      [  234.842504]        simple_attr_write+0xc1/0xe0
      [  234.842507]        full_proxy_write+0x53/0x80
      [  234.842509]        vfs_write+0xbc/0x350
      [  234.842513]        ksys_write+0x58/0xd0
      [  234.842514]        do_syscall_64+0x38/0x90
      [  234.842516]        entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  234.842519]
                     other info that might help us debug this:
      
      [  234.842521] Chain exists of:
                       fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map
      
      [  234.842526]  Possible unsafe locking scenario:
      
      [  234.842528]        CPU0                    CPU1
      [  234.842529]        ----                    ----
      [  234.842531]   lock(dma_fence_map);
      [  234.842532]                                lock(mmu_notifier_invalidate_range_start);
      [  234.842535]                                lock(dma_fence_map);
      [  234.842537]   lock(fs_reclaim);
      [  234.842539]
                      *** DEADLOCK ***
      
      [  234.842540] 4 locks held by gem_exec_captur/1180:
      [  234.842543]  #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0
      [  234.842547]  #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0
      [  234.842552]  #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915]
      [  234.842602]  #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915]
      [  234.842656]
                     stack backtrace:
      [  234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G     U  W         5.15.0-rc7+ #20
      [  234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021
      [  234.842664] Call Trace:
      [  234.842666]  dump_stack_lvl+0x57/0x72
      [  234.842669]  check_noncircular+0xde/0x100
      [  234.842672]  ? __lock_acquire+0x3bf/0x1dc0
      [  234.842675]  __lock_acquire+0x1161/0x1dc0
      [  234.842678]  lock_acquire+0xb5/0x2b0
      [  234.842680]  ? __kmalloc+0x4d/0x330
      [  234.842683]  ? finish_task_switch.isra.0+0xf2/0x360
      [  234.842686]  ? i915_vma_coredump_create+0x78/0x5b0 [i915]
      [  234.842734]  fs_reclaim_acquire+0xa1/0xd0
      [  234.842737]  ? __kmalloc+0x4d/0x330
      [  234.842739]  __kmalloc+0x4d/0x330
      [  234.842742]  i915_vma_coredump_create+0x78/0x5b0 [i915]
      [  234.842793]  ? capture_vma+0xbe/0x110 [i915]
      [  234.842844]  intel_engine_coredump_add_vma+0x36/0xe0 [i915]
      [  234.842892]  __i915_gpu_coredump+0x290/0x5e0 [i915]
      [  234.842939]  i915_capture_error_state+0x57/0xa0 [i915]
      [  234.842985]  intel_gt_handle_error+0x348/0x3e0 [i915]
      [  234.843032]  ? __mutex_lock+0x81/0x830
      [  234.843035]  ? simple_attr_write+0x3a/0xe0
      [  234.843038]  ? __lock_acquire+0x3bf/0x1dc0
      [  234.843041]  intel_gt_debugfs_reset_store+0x3c/0x90 [i915]
      [  234.843083]  ? _copy_from_user+0x45/0x80
      [  234.843086]  simple_attr_write+0xc1/0xe0
      [  234.843089]  full_proxy_write+0x53/0x80
      [  234.843091]  vfs_write+0xbc/0x350
      [  234.843094]  ksys_write+0x58/0xd0
      [  234.843096]  do_syscall_64+0x38/0x90
      [  234.843098]  entry_SYSCALL_64_after_hwframe+0x44/0xae
      [  234.843101] RIP: 0033:0x7fa467480877
      [  234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
      [  234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
      [  234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877
      [  234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007
      [  234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0
      [  234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014
      [  234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005
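
The circular dependency reported in the splat above can be illustrated with a small lock-order graph. This is only a sketch of the idea behind such a check, not lockdep's implementation; the enum, `edge` table and function names are hypothetical.

```c
#include <stdbool.h>

enum lock_class { FS_RECLAIM, MMU_NOTIFIER, DMA_FENCE_MAP, NR_CLASSES };

/* edge[a][b] is true once lock b has been acquired while holding lock a. */
static bool edge[NR_CLASSES][NR_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' along recorded edges? */
static bool reachable(int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_CLASSES; next++)
		if (edge[from][next] && !seen[next] &&
		    reachable(next, to, seen))
			return true;
	return false;
}

/*
 * Record "wanted is taken while held is held". Returns false, without
 * recording, when that ordering would close a cycle in the graph.
 */
static bool record_dependency(enum lock_class held, enum lock_class wanted)
{
	bool seen[NR_CLASSES] = { false };

	if (reachable(wanted, held, seen))
		return false;
	edge[held][wanted] = true;
	return true;
}
```

Recording fs_reclaim before mmu_notifier_invalidate_range_start and that before dma_fence_map succeeds; a later attempt to take fs_reclaim while holding dma_fence_map is rejected, which corresponds to the __kmalloc call in the capture path.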
      
      v5:
      - Use __GFP_KSWAPD_RECLAIM rather than GFP_NOWAIT for clarity.
        (Daniel Vetter)
      v6:
      - Include an instance in execlists_capture_work().
      - Rework the commit message due to patch reordering.
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-3-thomas.hellstrom@linux.intel.com
    • drm/i915: Avoid allocating a page array for the gpu coredump · e45b98ba
      Thomas Hellström authored
      The gpu coredump typically takes place in a dma_fence signalling
      critical path, and hence can't use GFP_KERNEL allocations, as that
      means we might hit deadlocks under memory pressure. However,
      changing to __GFP_KSWAPD_RECLAIM, which will be done in an upcoming
      patch, will instead mean a lower chance of the allocation succeeding,
      in particular for large contiguous allocations like the coredump page
      vector.
      Remove the page vector in favor of a linked list of single pages.
      Use the page lru list head as the list link, as the page owner is
      allowed to do that.
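
The idea of replacing one large vector with a list of single pages can be sketched in user space as follows. This is an illustration under stated assumptions, not the driver code: `struct model_page` and the helper names are hypothetical, and the `next` pointer merely plays the role that the page's lru list head plays in the kernel.

```c
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096

/* One page-sized chunk; 'next' stands in for the page's lru list link. */
struct model_page {
	struct model_page *next;
	unsigned char data[MODEL_PAGE_SIZE];
};

struct page_list {
	struct model_page *head;
	struct model_page *tail;
	size_t count;
};

/* Append one page; each allocation is small and independent of the total. */
static int page_list_append(struct page_list *pl)
{
	struct model_page *p = calloc(1, sizeof(*p));

	if (!p)
		return -1;
	if (pl->tail)
		pl->tail->next = p;
	else
		pl->head = p;
	pl->tail = p;
	pl->count++;
	return 0;
}

/* Walk the links and release every page. */
static void page_list_free(struct page_list *pl)
{
	struct model_page *p = pl->head;

	while (p) {
		struct model_page *next = p->next;

		free(p);
		p = next;
	}
	pl->head = pl->tail = NULL;
	pl->count = 0;
}
```

No allocation in this scheme grows with the size of the dump, so a single failure under memory pressure costs one page rather than the whole capture buffer.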
      Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
      Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-2-thomas.hellstrom@linux.intel.com
  9. 25 Nov, 2021 8 commits
  10. 24 Nov, 2021 1 commit
  11. 23 Nov, 2021 1 commit
    • drm/i915/gt: Hold RPM wakelock during PXP suspend · d22d446f
      Tejas Upadhyay authored
      selftest --r live shows a failure in the suspend tests when the
      RPM wakelock is not acquired during suspend.
      
      This change addresses the error below:
      <4> [154.177535] RPM wakelock ref not held during HW access
      <4> [154.177575] WARNING: CPU: 4 PID: 5772 at
      drivers/gpu/drm/i915/intel_runtime_pm.h:113
      fwtable_write32+0x240/0x320 [i915]
      <4> [154.177974] Modules linked in: i915(+) vgem drm_shmem_helper
      fuse snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic
      ledtrig_audio mei_hdcp mei_pxp x86_pkg_temp_thermal coretemp
      crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_intel_dspcfg
      snd_hda_codec snd_hwdep igc snd_hda_core ttm mei_me ptp
      snd_pcm prime_numbers mei i2c_i801 pps_core i2c_smbus intel_lpss_pci
      btusb btrtl btbcm btintel bluetooth ecdh_generic ecc [last unloaded: i915]
      <4> [154.178143] CPU: 4 PID: 5772 Comm: i915_selftest Tainted: G
      U            5.15.0-rc6-CI-Patchwork_21432+ #1
      <4> [154.178154] Hardware name: ASUS System Product Name/TUF GAMING
      Z590-PLUS WIFI, BIOS 0811 04/06/2021
      <4> [154.178160] RIP: 0010:fwtable_write32+0x240/0x320 [i915]
      <4> [154.178604] Code: 15 7b e1 0f 0b e9 34 fe ff ff 80 3d a9 89 31
      00 00 0f 85 31 fe ff ff 48 c7 c7 88 9e 4f a0 c6 05 95 89 31 00 01 e8
      c0 15 7b e1 <0f> 0b e9 17 fe ff ff 8b 05 0f 83 58 e2 85 c0 0f 85 8d
      00 00 00 48
      <4> [154.178614] RSP: 0018:ffffc900016279f0 EFLAGS: 00010286
      <4> [154.178626] RAX: 0000000000000000 RBX: ffff888204fe0ee0
      RCX: 0000000000000001
      <4> [154.178634] RDX: 0000000080000001 RSI: ffffffff823142b5
      RDI: 00000000ffffffff
      <4> [154.178641] RBP: 00000000000320f0 R08: 0000000000000000
      R09: c0000000ffffcd5a
      <4> [154.178647] R10: 00000000000f8c90 R11: ffffc90001627808
      R12: 0000000000000000
      <4> [154.178654] R13: 0000000040000000 R14: ffffffffa04d12e0
      R15: 0000000000000000
      <4> [154.178660] FS:  00007f7390aa4c00(0000) GS:ffff88844f000000(0000)
      knlGS:0000000000000000
      <4> [154.178669] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      <4> [154.178675] CR2: 000055bc40595028 CR3: 0000000204474005
      CR4: 0000000000770ee0
      <4> [154.178682] PKRU: 55555554
      <4> [154.178687] Call Trace:
      <4> [154.178706]  intel_pxp_fini_hw+0x23/0x30 [i915]
      <4> [154.179284]  intel_pxp_suspend+0x1f/0x30 [i915]
      <4> [154.179807]  live_gt_resume+0x5b/0x90 [i915]
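
The warning in the trace above comes from a register write made without a runtime-PM reference. The pattern can be modelled in user space as below; every name here (`model_rpm`, `model_write32`, and so on) is hypothetical and only mimics the shape of the check, not the i915 API.

```c
/* Minimal model of a runtime-PM reference count and a guarded register. */
struct model_rpm {
	int wakeref_count;
	int warnings;
};

struct model_hw {
	struct model_rpm *rpm;
	unsigned int reg;
};

static void model_rpm_get(struct model_rpm *rpm) { rpm->wakeref_count++; }
static void model_rpm_put(struct model_rpm *rpm) { rpm->wakeref_count--; }

/* A register write counts a warning when no wakeref is held. */
static void model_write32(struct model_hw *hw, unsigned int val)
{
	if (hw->rpm->wakeref_count <= 0)
		hw->rpm->warnings++; /* "RPM wakelock ref not held ..." */
	hw->reg = val;
}

/* Stand-in for the HW teardown that touches a register. */
static void model_pxp_fini_hw(struct model_hw *hw)
{
	model_write32(hw, 0);
}

/* Problem pattern: HW access during suspend without a wakeref. */
static void model_suspend_unprotected(struct model_hw *hw)
{
	model_pxp_fini_hw(hw);
}

/* Fixed pattern: bracket the HW access with get/put. */
static void model_suspend_with_wakeref(struct model_hw *hw)
{
	model_rpm_get(hw->rpm);
	model_pxp_fini_hw(hw);
	model_rpm_put(hw->rpm);
}
```

Calling the unprotected variant bumps the warning counter, while the bracketed variant leaves it unchanged and returns the reference count to zero.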
      
      Changes since V3 :
      	- Remove boolean in intel_pxp_runtime_preapre for
      	  non-pxp configs. Solves build error
      Changes since V2 :
      	- Open-code intel_pxp_runtime_suspend - Daniele
      	- Remove boolean in intel_pxp_runtime_preapre - Daniele
      Changes since V1 :
      	- split the HW access parts in gt_suspend_late - Daniele
      	- Remove default PXP configs
      Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
      Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Fixes: 0cfab4cb ("drm/i915/pxp: Enable PXP power management")
      Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20211117060321.3729343-1-tejaskumarx.surendrakumar.upadhyay@intel.com