    drm/nouveau/drm/nouveau: Prevent handling ACPI HPD events too early · 79e765ad
    Lyude Paul authored
    On most systems with ACPI hotplugging support, it seems that we always
    receive a hotplug event once we re-enable EC interrupts even if the GPU
    hasn't even been resumed yet.
    
    This can cause problems since even though we schedule hpd_work to handle
    connector reprobing for us, hpd_work synchronizes on
    pm_runtime_get_sync() to wait until the device is ready to perform
    reprobing. Since runtime suspend/resume callbacks are disabled before
    the PM core calls ->suspend(), any calls to pm_runtime_get_sync() during
    this period will grab a runtime PM ref and return immediately with
    -EACCES. Because we schedule hpd_work from our ACPI HPD handler, and
    hpd_work synchronizes on pm_runtime_get_sync(), this causes us to launch
    a connector reprobe immediately even if the GPU isn't actually resumed
    just yet. This causes various warnings in dmesg and occasionally also
    prevents some displays connected to the dedicated GPU from coming back
    up after suspend. Example:
    
    usb 1-4: USB disconnect, device number 14
    usb 1-4.1: USB disconnect, device number 15
    WARNING: CPU: 0 PID: 838 at drivers/gpu/drm/nouveau/include/nvkm/subdev/i2c.h:170 nouveau_dp_detect+0x17e/0x370 [nouveau]
    CPU: 0 PID: 838 Comm: kworker/0:6 Not tainted 4.17.14-201.Lyude.bz1477182.V3.fc28.x86_64 #1
    Hardware name: LENOVO 20EQS64N00/20EQS64N00, BIOS N1EET77W (1.50 ) 03/28/2018
    Workqueue: events nouveau_display_hpd_work [nouveau]
    RIP: 0010:nouveau_dp_detect+0x17e/0x370 [nouveau]
    RSP: 0018:ffffa15143933cf0 EFLAGS: 00010293
    RAX: 0000000000000000 RBX: ffff8cb4f656c400 RCX: 0000000000000000
    RDX: ffffa1514500e4e4 RSI: ffffa1514500e4e4 RDI: 0000000001009002
    RBP: ffff8cb4f4a8a800 R08: ffffa15143933cfd R09: ffffa15143933cfc
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8cb4fb57a000
    R13: ffff8cb4fb57a000 R14: ffff8cb4f4a8f800 R15: ffff8cb4f656c418
    FS:  0000000000000000(0000) GS:ffff8cb51f400000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f78ec938000 CR3: 000000073720a003 CR4: 00000000003606f0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Call Trace:
     ? _cond_resched+0x15/0x30
     nouveau_connector_detect+0x2ce/0x520 [nouveau]
     ? _cond_resched+0x15/0x30
     ? ww_mutex_lock+0x12/0x40
     drm_helper_probe_detect_ctx+0x8b/0xe0 [drm_kms_helper]
     drm_helper_hpd_irq_event+0xa8/0x120 [drm_kms_helper]
     nouveau_display_hpd_work+0x2a/0x60 [nouveau]
     process_one_work+0x187/0x340
     worker_thread+0x2e/0x380
     ? pwq_unbound_release_workfn+0xd0/0xd0
     kthread+0x112/0x130
     ? kthread_create_worker_on_cpu+0x70/0x70
     ret_from_fork+0x35/0x40
    Code: 4c 8d 44 24 0d b9 00 05 00 00 48 89 ef ba 09 00 00 00 be 01 00 00 00 e8 e1 09 f8 ff 85 c0 0f 85 b2 01 00 00 80 7c 24 0c 03 74 02 <0f> 0b 48 89 ef e8 b8 07 f8 ff f6 05 51 1b c8 ff 02 0f 84 72 ff
    ---[ end trace 55d811b38fc8e71a ]---
    
    So, to fix this we attempt to grab a runtime PM reference in the ACPI
    handler itself asynchronously. If the GPU is already awake (it will have
    normal hotplugging at this point) or runtime PM callbacks are currently
    disabled on the device, we drop our reference without updating the
    autosuspend delay. We only schedule connector reprobes when we
    successfully manage to queue up a resume request with our asynchronous
    PM ref.
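    
    The decision described above can be sketched as a small userspace
    model. This is only an illustration under stated assumptions, not the
    driver code: every model_* name is hypothetical, and the return values
    of model_pm_runtime_get() are assumed to mirror the asynchronous
    pm_runtime_get() (1 if the device is already active, -EACCES if runtime
    PM callbacks are disabled, 0 if a resume request was queued).

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device's runtime PM state (not a kernel type). */
enum model_pm_state {
	MODEL_PM_ACTIVE,    /* GPU is already runtime resumed */
	MODEL_PM_SUSPENDED, /* GPU is runtime suspended, callbacks enabled */
	MODEL_PM_DISABLED,  /* runtime PM callbacks disabled (system suspend/resume) */
};

/* Hypothetical device model that tracks only what the sketch needs. */
struct model_dev {
	enum model_pm_state state;
	int usage_count;         /* runtime PM references currently held */
	bool resume_queued;      /* an asynchronous resume was requested */
	bool hpd_work_scheduled; /* a connector reprobe was scheduled */
};

/*
 * Assumed to behave like the asynchronous pm_runtime_get(): always takes
 * a reference, then reports what happened to the resume request.
 */
static int model_pm_runtime_get(struct model_dev *dev)
{
	dev->usage_count++;

	switch (dev->state) {
	case MODEL_PM_ACTIVE:
		return 1;
	case MODEL_PM_DISABLED:
		return -EACCES;
	default:
		dev->resume_queued = true;
		return 0;
	}
}

/* Drops the reference taken by model_pm_runtime_get(). */
static void model_pm_runtime_put(struct model_dev *dev)
{
	dev->usage_count--;
}

/* Models the ACPI HPD handler: reprobe only if a resume was queued. */
static void model_acpi_hpd_event(struct model_dev *dev)
{
	int ret = model_pm_runtime_get(dev);

	if (ret == 0) {
		/* The GPU was asleep; this may be the only hotplug notice. */
		dev->hpd_work_scheduled = true;
	}
	/*
	 * ret == 1: the GPU is awake and handles its own hotplug events.
	 * ret == -EACCES: the GPU cannot be woken right now.
	 * In every case the handler keeps no reference of its own.
	 */
	model_pm_runtime_put(dev);
}
```

    In this model only the suspended case schedules a reprobe; the
    already-active and callbacks-disabled cases just drop the reference
    again. The model does not distinguish how the reference is dropped,
    so it does not capture the autosuspend delay handling.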
    
    This also has the added benefit of preventing redundant connector
    reprobes from ACPI while the GPU is runtime resumed!
    
    Signed-off-by: Lyude Paul <lyude@redhat.com>
    Cc: stable@vger.kernel.org
    Cc: Karol Herbst <kherbst@redhat.com>
    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1477182#c41
    Signed-off-by: Lyude Paul <lyude@redhat.com>
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
nouveau_display.c 25.5 KB