    drm/msm: Hangcheck progress detection · d73b1d02
    Rob Clark authored
    If the hangcheck timer expires, check if the fw's position in the
    cmdstream has advanced (changed) since the last timer expiration, and
    allow it up to three additional "extensions" to its allotted time.
    The intention is to continue to catch "shader stuck in a loop" type
    hangs quickly, but allow more time for things that are actually
    making forward progress.
    
    Because we need to sample the CP state twice to detect if there has
    not been progress, this also cuts the timer's duration in half.
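
The decision described above can be sketched in standalone C. This is only an illustration of the idea, not the driver's code: the struct, enum, function name, and the way the CP position is passed in are all hypothetical, and the only value taken from the commit message is the limit of three extensions.

```c
#include <stdbool.h>
#include <stdint.h>

/* "up to three additional extensions", per the commit message */
#define MAX_EXTENSIONS 3

/* Hypothetical per-ring bookkeeping; names are illustrative only. */
struct hangcheck_state {
    uint32_t last_fence;     /* completed fence seen at the previous expiry */
    uint64_t last_cp_pos;    /* cmdstream position sampled at the previous expiry */
    unsigned int extensions; /* extensions already granted to the current job */
};

enum hangcheck_result {
    HC_FENCE_ADVANCED, /* a job retired since the last expiry */
    HC_IDLE,           /* nothing is pending */
    HC_EXTENDED,       /* no job retired, but the cmdstream position moved */
    HC_HUNG,           /* no progress, or extensions exhausted */
};

/* Called on each (assumed) timer expiry with freshly sampled values. */
static enum hangcheck_result
hangcheck_expired(struct hangcheck_state *s, uint32_t completed_fence,
                  uint32_t submitted_fence, uint64_t cp_pos)
{
    if (completed_fence != s->last_fence) {
        /* Fence moved: start over with a fresh extension budget. */
        s->last_fence = completed_fence;
        s->last_cp_pos = cp_pos;
        s->extensions = 0;
        return HC_FENCE_ADVANCED;
    }

    if (completed_fence == submitted_fence)
        return HC_IDLE;

    /* Work is pending and the fence is stuck: compare the two samples. */
    if (cp_pos != s->last_cp_pos && s->extensions < MAX_EXTENSIONS) {
        s->last_cp_pos = cp_pos;
        s->extensions++;
        return HC_EXTENDED;
    }

    return HC_HUNG;
}
```

Because progress is judged by comparing the current sample against the one from the previous expiry, two expiries are needed before a stuck job can be declared hung, which is why the timer period is halved.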
    
    v2: Fix typo (REG_A6XX_CP_CSQ_IB2_STAT), add comment
    v3: Only halve hangcheck timer duration for generations which
        support progress detection (hdanton); removed unused a5xx
        progress (without knowing how to adjust for data buffered
        in ROQ it is too likely to report a false negative)
    v4: Comment updates to better describe the total hangcheck
        duration when progress detection is applied
    Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
    Tested-by: Chia-I Wu <olvaffe@gmail.com> # dEQP-GLES2.functional.flush_finish.wait
    Signed-off-by: Rob Clark <robdclark@chromium.org>
    Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
    Patchwork: https://patchwork.freedesktop.org/patch/511584/
    Link: https://lore.kernel.org/r/20221114193049.1533391-3-robdclark@gmail.com
a6xx_gpu.c 63.4 KB