• Guenter Roeck's avatar
    lib/list_debug.c: Detect uninitialized lists · 0cc011c5
    Guenter Roeck authored
    In some circumstances, attempts are made to add entries to or to remove
    entries from an uninitialized list.  A prime example is
    amdgpu_bo_vm_destroy(): It is indirectly called from
    ttm_bo_init_reserved() if that function fails, and tries to remove an
    entry from a list.  However, that list is only initialized in
    amdgpu_bo_create_vm() after the call to ttm_bo_init_reserved() returned
    success.  This results in crashes such as
    
     BUG: kernel NULL pointer dereference, address: 0000000000000000
     #PF: supervisor read access in kernel mode
     #PF: error_code(0x0000) - not-present page
     PGD 0 P4D 0
     Oops: 0000 [#1] PREEMPT SMP NOPTI
     CPU: 1 PID: 1479 Comm: chrome Not tainted 5.10.110-15768-g29a72e65dae5
     Hardware name: Google Grunt/Grunt, BIOS Google_Grunt.11031.149.0 07/15/2020
     RIP: 0010:__list_del_entry_valid+0x26/0x7d
     ...
     Call Trace:
      amdgpu_bo_vm_destroy+0x48/0x8b
      ttm_bo_init_reserved+0x1d7/0x1e0
      amdgpu_bo_create+0x212/0x476
      ? amdgpu_bo_user_destroy+0x23/0x23
      ? kmem_cache_alloc+0x60/0x271
      amdgpu_bo_create_vm+0x40/0x7d
      amdgpu_vm_pt_create+0xe8/0x24b
     ...
    
    Check if the list's prev and next pointers are NULL to catch such problems.
    
    Link: https://lkml.kernel.org/r/20220531222951.92073-1-linux@roeck-us.netSigned-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
    Cc: Steven Rostedt <rostedt@goodmis.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    0cc011c5
list_debug.c 2.21 KB