    drm/nouveau/gsp: Use the sg allocator for level 2 of radix3 · 6f572a80
    Lyude Paul authored
    Currently we allocate all 3 levels of radix3 page tables using
    nvkm_gsp_mem_ctor(), which uses dma_alloc_coherent() for allocating all of
    the relevant memory. This can end up failing in scenarios where the system
    has very high memory fragmentation, and we can't find enough contiguous
    memory to allocate level 2 of the page table.
    
    This can result in runtime PM issues on systems where memory
    fragmentation is high, as we'll fail to allocate the page table for our
    suspend/resume buffer:
    
      kworker/10:2: page allocation failure: order:7, mode:0xcc0(GFP_KERNEL),
      nodemask=(null),cpuset=/,mems_allowed=0
      CPU: 10 PID: 479809 Comm: kworker/10:2 Not tainted
      6.8.6-201.ChopperV6.fc39.x86_64 #1
      Hardware name: SLIMBOOK Executive/Executive, BIOS N.1.10GRU06 02/02/2024
      Workqueue: pm pm_runtime_work
      Call Trace:
       <TASK>
       dump_stack_lvl+0x64/0x80
       warn_alloc+0x165/0x1e0
       ? __alloc_pages_direct_compact+0xb3/0x2b0
       __alloc_pages_slowpath.constprop.0+0xd7d/0xde0
       __alloc_pages+0x32d/0x350
       __dma_direct_alloc_pages.isra.0+0x16a/0x2b0
       dma_direct_alloc+0x70/0x270
       nvkm_gsp_radix3_sg+0x5e/0x130 [nouveau]
       r535_gsp_fini+0x1d4/0x350 [nouveau]
       nvkm_subdev_fini+0x67/0x150 [nouveau]
       nvkm_device_fini+0x95/0x1e0 [nouveau]
       nvkm_udevice_fini+0x53/0x70 [nouveau]
       nvkm_object_fini+0xb9/0x240 [nouveau]
       nvkm_object_fini+0x75/0x240 [nouveau]
       nouveau_do_suspend+0xf5/0x280 [nouveau]
       nouveau_pmops_runtime_suspend+0x3e/0xb0 [nouveau]
       pci_pm_runtime_suspend+0x67/0x1e0
       ? __pfx_pci_pm_runtime_suspend+0x10/0x10
       __rpm_callback+0x41/0x170
       ? __pfx_pci_pm_runtime_suspend+0x10/0x10
       rpm_callback+0x5d/0x70
       ? __pfx_pci_pm_runtime_suspend+0x10/0x10
       rpm_suspend+0x120/0x6a0
       pm_runtime_work+0x98/0xb0
       process_one_work+0x171/0x340
       worker_thread+0x27b/0x3a0
       ? __pfx_worker_thread+0x10/0x10
       kthread+0xe5/0x120
       ? __pfx_kthread+0x10/0x10
       ret_from_fork+0x31/0x50
       ? __pfx_kthread+0x10/0x10
       ret_from_fork_asm+0x1b/0x30
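
    For scale, "order:7" in the failure above is a request for 2^7 physically
    contiguous pages. A small sketch of that arithmetic, assuming 4 KiB pages
    (the usual x86_64 base page size); the macro and helper names are
    hypothetical and not part of nouveau:

```c
#include <stddef.h>

/* Assumption for this sketch: 4 KiB base pages. */
#define SKETCH_PAGE_SIZE 4096u

/* Bytes of physically contiguous memory requested by an order-N allocation. */
static size_t order_to_bytes(unsigned int order)
{
	return (size_t)SKETCH_PAGE_SIZE << order;
}
```

    Under that assumption, order_to_bytes(7) is 524288 bytes (512 KiB), which
    the allocator must find as a single contiguous run.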
    
    Luckily, we don't actually need to allocate coherent memory for the page
    table, since we can pass the GPU a radix3 page table for suspend/resume
    data. So, let's rewrite nvkm_gsp_radix3_sg() to use the sg allocator for
    level 2. We continue using coherent allocations for levels 0 and 1, since
    they only take a single page each.
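
    The idea can be illustrated with a userspace sketch. This is not the
    driver code: struct chunk is a hypothetical stand-in for one scatterlist
    entry, fill_level2() is an invented helper, and the data buffer is assumed
    to be contiguous with 4 KiB pages purely to keep the example short. It
    shows a level-2 table split across several non-contiguous chunks, with the
    writer moving to the next chunk once the current one is full:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: each table entry maps one 4 KiB page. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical stand-in for one scatterlist entry: a contiguous piece of
 * the level-2 table with room for `nents` 64-bit entries.
 */
struct chunk {
	uint64_t *slots;
	size_t nents;
};

/*
 * Write one entry per data page into a level-2 table that is spread over
 * `nchunks` non-contiguous chunks. Returns the number of entries written,
 * which is less than `data_pages` if the chunks run out of room.
 */
static size_t fill_level2(struct chunk *chunks, size_t nchunks,
			  uint64_t data_base, size_t data_pages)
{
	size_t c = 0, i = 0, written = 0;

	while (written < data_pages && c < nchunks) {
		if (i == chunks[c].nents) {
			/* End of the current chunk: jump to the next one. */
			c++;
			i = 0;
			continue;
		}
		chunks[c].slots[i++] =
			data_base + (uint64_t)written * SKETCH_PAGE_SIZE;
		written++;
	}

	return written;
}
```

    Omitting the "jump to the next one" branch would keep writing past the end
    of the first chunk, which is the kind of mistake the V2 note describes.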
    
    V2:
    * Don't forget to actually jump to the next scatterlist when we reach the
      end of the scatterlist we're currently on when writing out the page table
      for level 2
    Signed-off-by: Lyude Paul <lyude@redhat.com>
    Cc: stable@vger.kernel.org
    Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>
    Link: https://patchwork.freedesktop.org/patch/msgid/20240429182318.189668-2-lyude@redhat.com
r535.c 62.6 KB