1. 15 Apr, 2024 9 commits
    • drm/nouveau/dp: Don't probe eDP ports twice harder · bf52d7f9
      Lyude Paul authored
      I didn't pay close enough attention the last time I tried to fix this
      problem - while we currently do correctly take care to make sure we don't
      probe a connected eDP port more than once, we don't do the same thing for
      eDP ports we found to be disconnected.
      
      So, fix this and make sure we only ever probe eDP ports once and then leave
      them at that connector state forever (since without HPD, it's not going to
      change on its own anyway). This should get rid of the last few GSP errors
      getting spit out during runtime suspend and resume on some machines, as we
      tried to reprobe eDP ports in response to ACPI hotplug probe events.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Dave Airlie <airlied@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240404233736.7946-3-lyude@redhat.com
      (cherry picked from commit fe6660b6)
    • drm/nouveau/kms/nv50-: Disable AUX bus for disconnected DP ports · ee7e980d
      Lyude Paul authored
      GSP has its own state for keeping track of whether or not a given display
      connector is plugged in, and enforces this state on the driver. In
      particular, AUX transactions on a DisplayPort connector which GSP says is
      disconnected can never succeed - and can in some cases even cause
      unexpected timeouts, which can trickle up to cause other problems. A good
      example of this is runtime power management: where we can actually get
      stuck trying to resume the GPU if a userspace application like fwupd tries
      accessing a drm_aux_dev for a disconnected port. This was an issue I hit a
      few times with my Slimbook Executive 16 - where trying to offload something
      to the discrete GPU would wake it up, and then potentially cause it to
      timeout as fwupd tried to immediately access the dp_aux_dev nodes for
      nouveau.
      
      Likewise: we don't really have any cases I know of where we'd want to
      ignore this state and try an aux transaction anyway - and failing pointless
      aux transactions immediately can even speed things up. So - let's start
      enabling/disabling the aux bus in nouveau_dp_detect() to fix this. We
      enable the aux bus during connector probing, and leave it enabled if we
      discover something is actually on the connector. Otherwise, we just shut it
      off.
      
      This should fix some people's runtime PM issues (including my own), and also get
      rid of quite a lot of GSP error spam in dmesg.
      Signed-off-by: Lyude Paul <lyude@redhat.com>
      Reviewed-by: Dave Airlie <airlied@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240404233736.7946-2-lyude@redhat.com
      (cherry picked from commit 9c8a10bf)
    • drm/v3d: Don't increment `enabled_ns` twice · 35f4f8c9
      Maíra Canal authored
      The commit 509433d8 ("drm/v3d: Expose the total GPU usage stats on sysfs")
      introduced the calculation of global GPU stats. For this purpose, it used
      the already existing infrastructure provided by commit 09a93cc4 ("drm/v3d:
      Implement show_fdinfo() callback for GPU usage stats"). While adding
      global GPU stats calculation ability, the author forgot to delete the
      existing one.
      
      Currently, the value of `enabled_ns` is incremented twice by the end of
      the job, when it should be added just once. Therefore, delete the
      leftovers from commit 509433d8 ("drm/v3d: Expose the total GPU usage
      stats on sysfs").
      
      Fixes: 509433d8 ("drm/v3d: Expose the total GPU usage stats on sysfs")
      Reported-by: Tvrtko Ursulin <tursulin@igalia.com>
      Signed-off-by: Maíra Canal <mcanal@igalia.com>
      Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
      Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240403203517.731876-2-mcanal@igalia.com
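The double-accounting problem described in this commit can be illustrated with a minimal userspace sketch. Only `enabled_ns` comes from the commit message; the struct, the other field names, and the helper functions are hypothetical and do not reproduce the v3d driver code. The point is that the elapsed time of a job is added to the counter in exactly one place.

```c
#include <stdint.h>

/* Hypothetical per-queue statistics; only 'enabled_ns' is named in the commit. */
struct sketch_stats {
	uint64_t start_ns;   /* timestamp at which the current job started */
	uint64_t enabled_ns; /* accumulated busy time */
	uint64_t jobs_sent;  /* number of completed jobs */
};

/* Record the start of a job. */
void sketch_job_start(struct sketch_stats *s, uint64_t now_ns)
{
	s->start_ns = now_ns;
}

/*
 * Record the end of a job. The elapsed time is added once; a second,
 * leftover "enabled_ns += ..." here would double the reported usage.
 */
void sketch_job_end(struct sketch_stats *s, uint64_t now_ns)
{
	s->enabled_ns += now_ns - s->start_ns;
	s->jobs_sent++;
	s->start_ns = 0;
}
```

With this shape, a job that starts at 100 ns and ends at 250 ns contributes 150 ns to `enabled_ns`.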
    • drm/vmwgfx: Sort primary plane formats by order of preference · d4c972bf
      Zack Rusin authored
      The table of primary plane formats wasn't sorted at all, leading to
      applications picking our least desirable formats by default.
      
      Sort the primary plane formats according to our order of preference.
      
      A nice side effect of this change is that IGT's kms_atomic
      plane-invalid-params test now passes. The test picks the first format,
      which for vmwgfx was DRM_FORMAT_XRGB1555, and uses fbs with odd sizes.
      Those make Pixman, which IGT depends on, assert, because our 16bpp
      formats aren't 32-bit aligned as Pixman requires all formats to be.
      Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
      Fixes: 36cc79bc ("drm/vmwgfx: Add universal plane support")
      Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
      Cc: dri-devel@lists.freedesktop.org
      Cc: <stable@vger.kernel.org> # v4.12+
      Acked-by: Pekka Paalanen <pekka.paalanen@collabora.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-6-zack.rusin@broadcom.com
    • drm/vmwgfx: Fix crtc's atomic check conditional · a60ccade
      Zack Rusin authored
      The conditional was supposed to prevent enabling of a crtc state
      without a set primary plane. Accidentally, it also prevented disabling a
      crtc state with a set primary plane. Neither is correct.
      
      Fix the conditional and just driver-warn when a crtc state has been
      enabled without a primary plane which will help debug broken userspace.
      
      Fixes IGT's kms_atomic_interruptible and kms_atomic_transition tests.
      Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
      Fixes: 06ec4190 ("drm/vmwgfx: Add and connect CRTC helper functions")
      Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list@broadcom.com>
      Cc: dri-devel@lists.freedesktop.org
      Cc: <stable@vger.kernel.org> # v4.12+
      Reviewed-by: Ian Forbes <ian.forbes@broadcom.com>
      Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-5-zack.rusin@broadcom.com
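The behaviour change described in this commit can be sketched in plain C. The function names and the exact form of the old conditional are assumptions made for illustration; they are one reading of the commit message, not a copy of the vmwgfx code. The old check rejects every mismatch between "enabled" and "has a primary plane", which also blocks disabling a crtc that still has a primary plane. The new check never rejects and only flags the suspicious combination so the driver can warn.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Assumed shape of the old conditional: any mismatch is an error, so
 * (enable = false, has_primary = true) is rejected as well.
 */
int sketch_check_old(bool enable, bool has_primary)
{
	if (has_primary != enable)
		return -EINVAL;
	return 0;
}

/*
 * Sketch of the described fix: accept every combination and report,
 * through 'warn', when a crtc is enabled without a primary plane.
 */
int sketch_check_new(bool enable, bool has_primary, bool *warn)
{
	*warn = enable && !has_primary;
	return 0;
}
```

In the real driver the warning would go through the DRM debug logging helpers; the `warn` out-parameter here only keeps the sketch testable in userspace.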
    • drm/vmwgfx: Fix prime import/export · b32233ac
      Zack Rusin authored
      vmwgfx never supported prime import of external buffers. Furthermore, the
      driver exposes two different objects to userspace: vmw_surfaces and
      gem buffers, but prime import/export only worked with vmw_surfaces.
      
      Because gem buffers are used through the dumb_buffer interface, this meant
      that driver-created buffers could not be prime exported or
      imported.
      
      Fix prime import/export. Makes IGT's kms_prime pass.
      Signed-off-by: Zack Rusin <zack.rusin@broadcom.com>
      Fixes: 8afa13a0 ("drm/vmwgfx: Implement DRIVER_GEM")
      Cc: <stable@vger.kernel.org> # v6.6+
      Reviewed-by: Martin Krastev <martin.krastev@broadcom.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240412025511.78553-4-zack.rusin@broadcom.com
    • drm/ttm: stop pooling cached NUMA pages v2 · b6976f32
      Christian König authored
      We only pool write combined and uncached allocations because they
      require extra overhead on allocation and release.
      
      If we also pool cached NUMA pages, it not only means some extra
      unnecessary overhead, but also that under memory pressure pages
      from the wrong NUMA node can enter the pool and be re-used
      over and over again.
      
      This can lead to performance reduction after running into memory
      pressure.
      
      v2: restructure and cleanup the code a bit from the internal hack to
          test this.
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Fixes: 4482d3c9 ("drm/ttm: add NUMA node id to the pool")
      CC: stable@vger.kernel.org
      Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240415134821.1919-1-christian.koenig@amd.com
    • drm: nv04: Fix out of bounds access · cf92bb77
      Mikhail Kobuk authored
      When the Output Resource (dcb->or) value is assigned in
      fabricate_dcb_output(), there may be an out-of-bounds access to the
      dac_users array if dcb->or is zero, because ffs(dcb->or) is
      used as an index there.
      The 'or' argument of fabricate_dcb_output() must be interpreted as a
      number of bit to set, not value.
      
      Utilize macros from 'enum nouveau_or' in calls instead of hardcoding.
      
      Found by Linux Verification Center (linuxtesting.org) with SVACE.
      
      Fixes: 2e5702af ("drm/nouveau: fabricate DCB encoder table for iMac G4")
      Fixes: 670820c0 ("drm/nouveau: Workaround incorrect DCB entry on a GeForce3 Ti 200.")
      Signed-off-by: Mikhail Kobuk <m.kobuk@ispras.ru>
      Signed-off-by: Danilo Krummrich <dakr@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240411110854.16701-1-m.kobuk@ispras.ru
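Why a zero value is dangerous when `ffs()` feeds an array index can be shown with a short userspace sketch. It assumes the index is derived as `ffs(mask) - 1`; the enum values, array size, and function name below are hypothetical stand-ins, not the nouveau definitions. POSIX `ffs()` returns 0 when no bit is set, so the derived index becomes -1.

```c
#include <strings.h> /* POSIX ffs() */

/* Hypothetical stand-ins for single-bit values such as those in 'enum nouveau_or'. */
enum sketch_or {
	SKETCH_OUTPUT_A = 1 << 0,
	SKETCH_OUTPUT_B = 1 << 1,
	SKETCH_OUTPUT_C = 1 << 2,
};

#define SKETCH_NUM_DACS 4 /* hypothetical size of the indexed array */

/*
 * Convert an OR bit mask to an array index.
 * Returns -1 if the mask is empty or its lowest set bit is out of range,
 * so the caller never indexes outside the array.
 */
int sketch_or_to_index(int or_mask)
{
	int idx = ffs(or_mask) - 1; /* ffs(0) == 0, which gives idx == -1 */

	if (idx < 0 || idx >= SKETCH_NUM_DACS)
		return -1;
	return idx;
}
```

Passing a named single-bit constant instead of a hardcoded number makes the zero case visible at the call site, which is the spirit of the change described above.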
    • nouveau: fix instmem race condition around ptr stores · fff1386c
      Dave Airlie authored
      Running a lot of VK CTS in parallel against nouveau, once every
      few hours you might see something like this crash.
      
      BUG: kernel NULL pointer dereference, address: 0000000000000008
      PGD 8000000114e6e067 P4D 8000000114e6e067 PUD 109046067 PMD 0
      Oops: 0000 [#1] PREEMPT SMP PTI
      CPU: 7 PID: 53891 Comm: deqp-vk Not tainted 6.8.0-rc6+ #27
      Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021
      RIP: 0010:gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
      Code: c7 48 01 c8 49 89 45 58 85 d2 0f 84 95 00 00 00 41 0f b7 46 12 49 8b 7e 08 89 da 42 8d 2c f8 48 8b 47 08 41 83 c7 01 48 89 ee <48> 8b 40 08 ff d0 0f 1f 00 49 8b 7e 08 48 89 d9 48 8d 75 04 48 c1
      RSP: 0000:ffffac20c5857838 EFLAGS: 00010202
      RAX: 0000000000000000 RBX: 00000000004d8001 RCX: 0000000000000001
      RDX: 00000000004d8001 RSI: 00000000000006d8 RDI: ffffa07afe332180
      RBP: 00000000000006d8 R08: ffffac20c5857ad0 R09: 0000000000ffff10
      R10: 0000000000000001 R11: ffffa07af27e2de0 R12: 000000000000001c
      R13: ffffac20c5857ad0 R14: ffffa07a96fe9040 R15: 000000000000001c
      FS:  00007fe395eed7c0(0000) GS:ffffa07e2c980000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000008 CR3: 000000011febe001 CR4: 00000000003706f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
      
      ...
      
       ? gp100_vmm_pgt_mem+0xe3/0x180 [nouveau]
       ? gp100_vmm_pgt_mem+0x37/0x180 [nouveau]
       nvkm_vmm_iter+0x351/0xa20 [nouveau]
       ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
       ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
       ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
       ? __lock_acquire+0x3ed/0x2170
       ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
       nvkm_vmm_ptes_get_map+0xc2/0x100 [nouveau]
       ? __pfx_nvkm_vmm_ref_ptes+0x10/0x10 [nouveau]
       ? __pfx_gp100_vmm_pgt_mem+0x10/0x10 [nouveau]
       nvkm_vmm_map_locked+0x224/0x3a0 [nouveau]
      
      Adding any sort of useful debug usually makes it go away, so I hand
      wrote the function in a line, and debugged the asm.
      
      Every so often pt->memory->ptrs is NULL. This ptrs ptr is set in
      the nv50_instobj_acquire called from nvkm_kmap.
      
      If Thread A and Thread B both get to nv50_instobj_acquire around
      the same time, and Thread A hits the refcount_set line, and in
      lockstep thread B succeeds at refcount_inc_not_zero, there is a
      chance the ptrs value won't have been stored since refcount_set
      is unordered. Force a memory barrier here; I picked smp_mb, since
      we want it on all CPUs and it's a write followed by a read.
      
      v2: use paired smp_rmb/smp_wmb.
      
      Cc: <stable@vger.kernel.org>
      Fixes: be55287a ("drm/nouveau/imem/nv50: embed nvkm_instobj directly into nv04_instobj")
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Danilo Krummrich <dakr@redhat.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240411011510.2546857-1-airlied@gmail.com
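The ordering requirement behind this fix (the pointer store must be visible before the reference count becomes non-zero) can be sketched in userspace with C11 atomics. This is an analogue under stated assumptions, not the nouveau code: the struct and function names are hypothetical, and release/acquire ordering is used in place of the kernel's paired smp_wmb()/smp_rmb().

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object: 'ptrs' must be published before 'maps' is non-zero. */
struct sketch_obj {
	const void *ptrs;
	atomic_int maps;
};

/*
 * First user: store the pointer, then publish the count. The release
 * store orders the 'ptrs' write before the count becomes visible.
 */
void sketch_publish(struct sketch_obj *obj, const void *ptrs)
{
	obj->ptrs = ptrs;
	atomic_store_explicit(&obj->maps, 1, memory_order_release);
}

/*
 * Later users: take a reference only if the count is already non-zero
 * (similar in spirit to refcount_inc_not_zero). The acquire ordering
 * pairs with the release store above, so 'ptrs' is valid on success.
 * Returns the published pointer, or NULL if nothing is published yet.
 */
const void *sketch_acquire(struct sketch_obj *obj)
{
	int old = atomic_load_explicit(&obj->maps, memory_order_acquire);

	while (old != 0) {
		if (atomic_compare_exchange_weak_explicit(&obj->maps, &old, old + 1,
							  memory_order_acquire,
							  memory_order_acquire))
			return obj->ptrs;
	}
	return NULL;
}
```

Without the release/acquire pair, a second thread could observe the non-zero count and still read a stale NULL pointer, which matches the symptom reported in the oops above.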
  2. 09 Apr, 2024 1 commit
  3. 08 Apr, 2024 10 commits
  4. 05 Apr, 2024 4 commits
  5. 04 Apr, 2024 1 commit
  6. 01 Apr, 2024 1 commit
  7. 28 Mar, 2024 6 commits
  8. 27 Mar, 2024 1 commit
    • drm/vmwgfx: Create debugfs ttm_resource_manager entry only if needed · 4be9075f
      Jocelyn Falempe authored
      The driver creates /sys/kernel/debug/dri/0/mob_ttm even when the
      corresponding ttm_resource_manager is not allocated.
      This leads to a crash when trying to read from this file.
      
      Add a check to create the mob_ttm, system_mob_ttm, and gmr_ttm debug files
      only when the corresponding ttm_resource_manager is allocated.
      
      crash> bt
      PID: 3133409  TASK: ffff8fe4834a5000  CPU: 3    COMMAND: "grep"
       #0 [ffffb954506b3b20] machine_kexec at ffffffffb2a6bec3
       #1 [ffffb954506b3b78] __crash_kexec at ffffffffb2bb598a
       #2 [ffffb954506b3c38] crash_kexec at ffffffffb2bb68c1
       #3 [ffffb954506b3c50] oops_end at ffffffffb2a2a9b1
       #4 [ffffb954506b3c70] no_context at ffffffffb2a7e913
       #5 [ffffb954506b3cc8] __bad_area_nosemaphore at ffffffffb2a7ec8c
       #6 [ffffb954506b3d10] do_page_fault at ffffffffb2a7f887
       #7 [ffffb954506b3d40] page_fault at ffffffffb360116e
          [exception RIP: ttm_resource_manager_debug+0x11]
          RIP: ffffffffc04afd11  RSP: ffffb954506b3df0  RFLAGS: 00010246
          RAX: ffff8fe41a6d1200  RBX: 0000000000000000  RCX: 0000000000000940
          RDX: 0000000000000000  RSI: ffffffffc04b4338  RDI: 0000000000000000
          RBP: ffffb954506b3e08   R8: ffff8fee3ffad000   R9: 0000000000000000
          R10: ffff8fe41a76a000  R11: 0000000000000001  R12: 00000000ffffffff
          R13: 0000000000000001  R14: ffff8fe5bb6f3900  R15: ffff8fe41a6d1200
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
       #8 [ffffb954506b3e00] ttm_resource_manager_show at ffffffffc04afde7 [ttm]
       #9 [ffffb954506b3e30] seq_read at ffffffffb2d8f9f3
          RIP: 00007f4c4eda8985  RSP: 00007ffdbba9e9f8  RFLAGS: 00000246
          RAX: ffffffffffffffda  RBX: 000000000037e000  RCX: 00007f4c4eda8985
          RDX: 000000000037e000  RSI: 00007f4c41573000  RDI: 0000000000000003
          RBP: 000000000037e000   R8: 0000000000000000   R9: 000000000037fe30
          R10: 0000000000000000  R11: 0000000000000246  R12: 00007f4c41573000
          R13: 0000000000000003  R14: 00007f4c41572010  R15: 0000000000000003
          ORIG_RAX: 0000000000000000  CS: 0033  SS: 002b
      Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com>
      Fixes: af4a25bb ("drm/vmwgfx: Add debugfs entries for various ttm resource managers")
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Zack Rusin <zack.rusin@broadcom.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240312093551.196609-1-jfalempe@redhat.com
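The guard described in this commit amounts to skipping entry creation when the backing manager was never allocated. The sketch below models that in userspace; the types and the function are hypothetical and do not use the real debugfs or TTM APIs.

```c
#include <stddef.h>

/* Hypothetical stand-in for an allocated resource manager. */
struct sketch_manager {
	int unused;
};

#define SKETCH_MAX_ENTRIES 3

/* Hypothetical record of which debug entries were created. */
struct sketch_debugfs {
	const char *names[SKETCH_MAX_ENTRIES];
	size_t count;
};

/*
 * Create an entry only if its manager exists. Without the NULL check,
 * a later read of the entry would dereference a NULL manager.
 */
void sketch_add_entry(struct sketch_debugfs *d,
		      const struct sketch_manager *man, const char *name)
{
	if (!man)
		return;
	if (d->count < SKETCH_MAX_ENTRIES)
		d->names[d->count++] = name;
}
```

Calling `sketch_add_entry()` once per manager leaves only the entries whose manager pointer is non-NULL.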
  9. 26 Mar, 2024 1 commit
  10. 25 Mar, 2024 2 commits
  11. 24 Mar, 2024 4 commits
    • Linux 6.9-rc1 · 4cece764
      Linus Torvalds authored
    • Merge tag 'efi-fixes-for-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi · ab8de2db
      Linus Torvalds authored
      Pull EFI fixes from Ard Biesheuvel:
      
       - Fix logic that is supposed to prevent placement of the kernel image
         below LOAD_PHYSICAL_ADDR
      
       - Use the firmware stack in the EFI stub when running in mixed mode
      
       - Clear BSS only once when using mixed mode
      
       - Check efi.get_variable() function pointer for NULL before trying to
         call it
      
      * tag 'efi-fixes-for-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
        efi: fix panic in kdump kernel
        x86/efistub: Don't clear BSS twice in mixed mode
        x86/efistub: Call mixed mode boot services on the firmware's stack
        efi/libstub: fix efi_random_alloc() to allocate memory at alloc_min or higher address
    • Merge tag 'x86-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 5e74df2f
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
      
       - Ensure that the encryption mask at boot is properly propagated on
         5-level page tables, otherwise the PGD entry is incorrectly set to
         non-encrypted, which causes system crashes during boot.
      
       - Undo the deferred 5-level page table setup as it cannot work with
         memory encryption enabled.
      
       - Prevent inconsistent XFD state on CPU hotplug, where the MSR is reset
         to the default value but the cached variable is not, so subsequent
         comparisons might yield the wrong result and as a consequence the
         result prevents updating the MSR.
      
       - Register the local APIC address only once in the MPPARSE enumeration
         to prevent triggering the related WARN_ONs() in the APIC and topology
         code.
      
       - Handle the case where no APIC is found gracefully by registering a
         fake APIC in the topology code. That makes all related topology
         functions work correctly and does not affect the actual APIC driver
         code at all.
      
       - Don't evaluate logical IDs during early boot as the local APIC IDs
         are not yet enumerated and the invoked function returns an error
         code. Nothing requires the logical IDs before the final CPUID
         enumeration takes place, which happens after the enumeration.
      
       - Cure the fallout of the per CPU rework on UP which misplaced the
         copying of boot_cpu_data to per CPU data so that the final update to
         boot_cpu_data got lost which caused inconsistent state and boot
         crashes.
      
       - Use copy_from_kernel_nofault() in the kprobes setup as there is no
         guarantee that the address can be safely accessed.
      
       - Reorder struct members in struct saved_context to work around another
         kmemleak false positive
      
       - Remove the buggy code which tries to update the E820 kexec table for
         setup_data as that is never passed to the kexec kernel.
      
       - Update the resource control documentation to use the proper units.
      
       - Fix a Kconfig warning observed with tinyconfig
      
      * tag 'x86-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/boot/64: Move 5-level paging global variable assignments back
        x86/boot/64: Apply encryption mask to 5-level pagetable update
        x86/cpu: Add model number for another Intel Arrow Lake mobile processor
        x86/fpu: Keep xfd_state in sync with MSR_IA32_XFD
        Documentation/x86: Document that resctrl bandwidth control units are MiB
        x86/mpparse: Register APIC address only once
        x86/topology: Handle the !APIC case gracefully
        x86/topology: Don't evaluate logical IDs during early boot
        x86/cpu: Ensure that CPU info updates are propagated on UP
        kprobes/x86: Use copy_from_kernel_nofault() to read from unsafe address
        x86/pm: Work around false positive kmemleak report in msr_build_context()
        x86/kexec: Do not update E820 kexec table for setup_data
        x86/config: Fix warning for 'make ARCH=x86_64 tinyconfig'
    • Merge tag 'sched-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b136f68e
      Linus Torvalds authored
      Pull scheduler doc clarification from Thomas Gleixner:
       "A single update for the documentation of the base_slice_ns tunable to
        clarify that any value which is less than the tick slice has no effect
        because the scheduler tick is not guaranteed to happen within the set
        time slice"
      
      * tag 'sched-urgent-2024-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/doc: Update documentation for base_slice_ns and CONFIG_HZ relation