    drm/i915: Block fbdev HPD processing during suspend · fe5ec656
    Lyude Paul authored
    When resuming, we check whether or not any previously connected
    MST topologies are still present and if so, attempt to resume them. If
    this fails, we disable said MST topologies and fire off a hotplug event
    so that userspace knows to reprobe.
    
    However, sending a hotplug event involves calling
    drm_fb_helper_hotplug_event(), which in turn results in fbcon doing a
    connector reprobe in the caller's thread - something we can't do at the
    point in which i915 calls drm_dp_mst_topology_mgr_resume() since
    hotplugging hasn't been fully initialized yet.
    
    This currently causes some rather subtle but fatal issues. For example,
    on my T480s the laptop dock connected to it usually disappears during a
    suspend cycle, and comes back up a short while after the system has been
    resumed. This guarantees that on pretty much every suspend and resume
    cycle, drm_dp_mst_topology_mgr_set_mst(mgr, false) will be called and,
    in turn, a connector hotplug will occur. Now it's Rube Goldberg time:
    when the connector hotplug occurs, i915 reprobes /all/ of the
    connectors,
    including eDP. However, eDP probing requires that we power on the panel
    VDD which in turn, grabs a wakeref to the appropriate power domain on
    the GPU (on my T480s, this is the PORT_DDI_A_IO domain). This is where
    things start breaking: since this all happens before
    intel_power_domains_enable() is called, we end up leaking the wakeref
    that was acquired and never releasing it later. Come next suspend/resume
    cycle, this causes us to fail to shut down the GPU properly, which
    causes it not to resume properly and die a horrible complicated death.
    
    (As a note: this only happens when there's both an eDP panel and a
    connected MST topology that is removed mid-suspend. Either one on its
    own seems to always be OK.)
    
    We could try to fix the VDD wakeref leak, but this doesn't seem like
    it's worth it at all since we aren't able to handle hotplug detection
    while resuming anyway. So, let's go with a more robust solution inspired
    by nouveau: block fbdev from handling hotplug events until we resume
    fbdev. This allows us to still send sysfs hotplug events to be handled
    later by user space while we're resuming, while also preventing us from
    actually processing any hotplug events we receive until it's safe.
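To illustrate the deferral described above, here is a minimal userspace sketch in C. It is not the driver code: struct fbdev_model, model_hotplug(), model_set_suspend() and model_reprobe() are hypothetical stand-ins, a pthread mutex replaces the kernel mutex, and the field names hpd_lock and hpd_suspended are assumptions modelled on the hpd_waiting flag mentioned in the changelog. The reprobe stand-in only counts calls so the behaviour can be observed.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the fbdev hotplug state. */
struct fbdev_model {
	pthread_mutex_t hpd_lock;
	bool hpd_suspended;	/* true while hotplugs must not be processed */
	bool hpd_waiting;	/* a hotplug arrived since the last state change */
	int reprobes;		/* number of times the reprobe stand-in ran */
};

static void model_init(struct fbdev_model *m)
{
	pthread_mutex_init(&m->hpd_lock, NULL);
	m->hpd_suspended = false;
	m->hpd_waiting = false;
	m->reprobes = 0;
}

/*
 * Stand-in for drm_fb_helper_hotplug_event(). It only counts calls and is
 * always invoked after hpd_lock has been dropped. The counter is not
 * protected, which is fine for this single-threaded illustration only.
 */
static void model_reprobe(struct fbdev_model *m)
{
	m->reprobes++;
}

/* A hotplug arrives: always record it, process it only if not suspended. */
static void model_hotplug(struct fbdev_model *m)
{
	bool send_hpd;

	pthread_mutex_lock(&m->hpd_lock);
	send_hpd = !m->hpd_suspended;
	m->hpd_waiting = true;
	pthread_mutex_unlock(&m->hpd_lock);

	if (send_hpd)
		model_reprobe(m);
}

/* Suspend or resume fbdev; on resume, replay one deferred hotplug. */
static void model_set_suspend(struct fbdev_model *m, bool suspended)
{
	bool send_hpd;

	pthread_mutex_lock(&m->hpd_lock);
	m->hpd_suspended = suspended;
	send_hpd = !suspended && m->hpd_waiting;
	m->hpd_waiting = false;
	pthread_mutex_unlock(&m->hpd_lock);

	if (send_hpd)
		model_reprobe(m);
}
```

In this model, any number of hotplugs received while suspended collapse into a single reprobe once model_set_suspend(m, false) runs, and the reprobe itself never executes with the lock held.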
    
    This fixes the wakeref leak observed on the T480s and as such, also
    fixes suspend/resume with MST topologies connected on this machine.
    
    Changes since v2:
    * Don't call drm_fb_helper_hotplug_event() under lock, do it after lock
      (Chris Wilson)
    * Don't call drm_fb_helper_hotplug_event() in
      intel_fbdev_output_poll_changed() under lock (Chris Wilson)
    * Always set ifbdev->hpd_waiting (Chris Wilson)

    Signed-off-by: Lyude Paul <lyude@redhat.com>
    Fixes: 0e32b39c ("drm/i915: add DP 1.2 MST support (v0.7)")
    Cc: Todd Previte <tprevite@gmail.com>
    Cc: Dave Airlie <airlied@redhat.com>
    Cc: Jani Nikula <jani.nikula@linux.intel.com>
    Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
    Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
    Cc: Imre Deak <imre.deak@intel.com>
    Cc: intel-gfx@lists.freedesktop.org
    Cc: <stable@vger.kernel.org> # v3.17+
    Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
    Link: https://patchwork.freedesktop.org/patch/msgid/20190129191001.442-2-lyude@redhat.com