    drm/amd/display: Fix pageflip event race condition for DCN. · eb916a5a
    Mario Kleiner authored
    Commit '16f17eda ("drm/amd/display: Send vblank and user
    events at vsartup for DCN")' introduces a new way of pageflip
    completion handling for DCN, and some trouble.
    
    The current implementation introduces a race condition, which
    can cause pageflip completion events to be sent out one vblank
    too early, thereby confusing userspace and causing flicker:
    
    prepare_flip_isr():
    
    1. Pageflip programming takes the ddev->event_lock.
    2. Sets acrtc->pflip_status = AMDGPU_FLIP_SUBMITTED.
    3. Releases ddev->event_lock.
    
    --> Deadline for surface address regs double-buffering passes on
        target pipe.
    
    4. dc_commit_updates_for_stream() MMIO programs the new pageflip
       into hw, but too late for current vblank.
    
    => pflip_status == AMDGPU_FLIP_SUBMITTED, but flip won't complete
       in current vblank due to missing the double-buffering deadline
       by a tiny bit.
    
    5. VSTARTUP trigger point in vblank is reached, VSTARTUP irq fires,
       dm_dcn_crtc_high_irq() gets called.
    
    6. Detects pflip_status == AMDGPU_FLIP_SUBMITTED and assumes the
       pageflip has been completed/will complete in this vblank and
       sends out pageflip completion event to userspace and resets
       pflip_status = AMDGPU_FLIP_NONE.
    
    => Flip completion event sent out one vblank too early.
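
    The sequence above can be modelled in a few lines of plain C. This is
    only an illustrative sketch, not driver code: apart from the two
    AMDGPU_FLIP_* status names, every identifier (flip_model, hw_latched,
    submit_flip, vstartup_irq_ungated, event_was_premature) is a
    hypothetical stand-in invented for the example.

```c
#include <stdbool.h>

/* Only these two status names are taken from the commit message. */
enum pflip_status { AMDGPU_FLIP_NONE, AMDGPU_FLIP_SUBMITTED };

/* Hypothetical, heavily simplified stand-in for the per-crtc flip state. */
struct flip_model {
    enum pflip_status pflip_status;
    bool hw_latched;  /* new surface address made the double-buffering deadline */
    bool event_sent;  /* completion event was delivered to userspace */
};

/* Steps 1-4: mark the flip as submitted, then program the hardware.
 * before_deadline says whether the MMIO write landed in time. */
static void submit_flip(struct flip_model *m, bool before_deadline)
{
    m->pflip_status = AMDGPU_FLIP_SUBMITTED;
    m->hw_latched = before_deadline;
}

/* Steps 5-6: VSTARTUP handler that trusts pflip_status unconditionally. */
static void vstartup_irq_ungated(struct flip_model *m)
{
    if (m->pflip_status == AMDGPU_FLIP_SUBMITTED) {
        m->event_sent = true;
        m->pflip_status = AMDGPU_FLIP_NONE;
    }
}

/* True when userspace was told about a flip the hardware has not latched. */
static bool event_was_premature(const struct flip_model *m)
{
    return m->event_sent && !m->hw_latched;
}
```

    Calling submit_flip() with before_deadline == false and then
    vstartup_irq_ungated() leaves event_was_premature() true, which is
    the "one vblank too early" case described above.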
    
    This behaviour has been observed a couple of times during my testing
    with measurement hardware.
    
    The commit message says that the extra flip event code was added to
    dm_dcn_crtc_high_irq() to avoid missing pageflip events when the
    pflip irq doesn't fire, because the "DCH HUBP" component is clock
    gated and doesn't fire pflip irqs in that state. It also says that
    this clock gating may happen if no planes are active. This suggests
    that the problem addressed by that commit can't happen if planes
    are active.
    
    The proposed solution is therefore to only execute the extra pflip
    completion code iff the count of active planes is zero and otherwise
    leave pflip completion handling to the pflip irq handler, for a
    more race-free experience.
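
    As a rough sketch of that condition, again in plain C rather than
    actual driver code: only the AMDGPU_FLIP_* names come from this
    message, while crtc_model, active_planes, events_sent and
    vstartup_irq_model are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Only these two status names are taken from the commit message. */
enum pflip_status { AMDGPU_FLIP_NONE, AMDGPU_FLIP_SUBMITTED };

/* Hypothetical, simplified stand-in for the per-crtc state. */
struct crtc_model {
    enum pflip_status pflip_status;
    int active_planes;  /* count of active planes on this crtc */
    int events_sent;    /* completion events delivered so far */
};

/* Models the VSTARTUP handler with the proposed gate: complete the flip
 * here only when no planes are active, i.e. when the pflip irq may not
 * fire. Otherwise leave completion to the pflip irq handler.
 * Returns true if a completion event was sent. */
static bool vstartup_irq_model(struct crtc_model *crtc)
{
    if (crtc->pflip_status == AMDGPU_FLIP_SUBMITTED &&
        crtc->active_planes == 0) {
        crtc->events_sent++;
        crtc->pflip_status = AMDGPU_FLIP_NONE;
        return true;
    }
    return false;
}
```

    With at least one active plane the model leaves pflip_status at
    AMDGPU_FLIP_SUBMITTED, so a late-programmed flip is no longer
    reported as complete from the VSTARTUP path.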
    
    Note that I don't know if this fixes the problem the original commit
    tried to address, as I don't know what the test scenario was. It
    does fix the observed too-early pageflip events though, and points
    out the problem introduced.
    
    Fixes: 16f17eda ("drm/amd/display: Send vblank and user events at vsartup for DCN")
    Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
    Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_dm.c 236 KB