    drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround. · b59dd202
    Francisco Jerez authored
    commit 4fc020d8 upstream.
    
    The WaDisableLSQCROPERFforOCL workaround has the side effect of
    disabling an L3SQ optimization that has huge performance implications
    and is unlikely to be necessary for the correct functioning of typical
    graphics workloads.  Userspace is free to re-enable the workaround on
    demand, and is generally in a better position to determine whether the
    workaround is necessary than the DRM is (e.g. only during the
    execution of compute kernels that rely on both L3 fences and HDC R/W
    requests).
    
    The same workaround seems to apply to BDW (at least to production
    stepping G1) and SKL as well (the internal workaround database claims
    that it does for all steppings, while the BSpec workaround table only
    mentions pre-production steppings), but the DRM doesn't do anything
    beyond whitelisting the L3SQCREG4 register so userspace can enable it
    when it sees fit.  Do the same on KBL platforms.
    
    Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
    and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master.
    This comes after a regression of 35% and 10% respectively for the
    same benchmarks and platform caused by my recent patch series
    switching userspace to use the dataport constant cache instead of the
    sampler to implement uniform pull constant loads, which caused us to
    hit the L3 cache more heavily (and on platforms other than KBL had the
    opposite effect of improving performance of the same two benchmarks).
    The overall effect on KBL of this change combined with the recent
    userspace change is an improvement of 4.6% and 2.6% respectively.
    SynMark2 OglShMapPcf was affected by the constant cache changes
    (though it improved as it did on other platforms rather than
    regressing), but is not significantly affected by this patch (with
    statistical significance of 5% and sample size 20).
    
    v2: Drop some more code to avoid unused variable warning.
    
    Fixes: 738fa1b3 ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256
    Signed-off-by: Francisco Jerez <currojerez@riseup.net>
    Cc: Matthew Auld <matthew.william.auld@gmail.com>
    Cc: Eero Tamminen <eero.t.tamminen@intel.com>
    Cc: Jani Nikula <jani.nikula@intel.com>
    Cc: Mika Kuoppala <mika.kuoppala@intel.com>
    Cc: beignet@lists.freedesktop.org
    Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
    [Removed double Fixes tag]
    Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
    Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com
    (cherry picked from commit 8726f2fa)
    Signed-off-by: Jani Nikula <jani.nikula@intel.com>
    [ Francisco Jerez: Rebase on v4.9 branch. ]
    Signed-off-by: Francisco Jerez <currojerez@riseup.net>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
intel_lrc.c 65.5 KB