• Arun Siluvery's avatar
    drm/i915/gen8: Add infrastructure to initialize WA batch buffers · 17ee950d
    Arun Siluvery authored
    Some of the WA are to be applied during context save but before restore and
    some at the end of context save/restore but before executing the instructions
    in the ring, WA batch buffers are created for this purpose and these WA cannot
    be applied using normal means. Each context has two registers to load the
    offsets of these batch buffers. If they are non-zero, HW understands that it
    need to execute these batches.
    
    v1: In this version two separate ring_buffer objects were used to load WA
    instructions for indirect and per context batch buffers and they were part
    of every context.
    
    v2: Chris suggested to include additional page in context and use it to load
    these WA instead of creating separate objects. This will simplify lot of things
    as we need not explicity pin/unpin them. Thomas Daniel further pointed that GuC
    is planning to use a similar setup to share data between GuC and driver and
    WA batch buffers can probably share that page. However after discussions with
    Dave who is implementing GuC changes, he suggested to use an independent page
    for the reasons - GuC area might grow and these WA are initialized only once and
    are not changed afterwards so we can share them share across all contexts.
    
    The page is updated with WA during render ring init. This has an advantage of
    not adding more special cases to default_context.
    
    We don't know upfront the number of WA we will applying using these batch buffers.
    For this reason the size was fixed earlier but it is not a good idea. To fix this,
    the functions that load instructions are modified to report the no of commands
    inserted and the size is now calculated after the batch is updated. A macro is
    introduced to add commands to these batch buffers which also checks for overflow
    and returns error.
    We have a full page dedicated for these WA so that should be sufficient for
    good number of WA, anything more means we have major issues.
    The list for Gen8 is small, same for Gen9 also, maybe few more gets added
    going forward but not close to filling entire page. Chris suggested a two-pass
    approach but we agreed to go with single page setup as it is a one-off routine
    and simpler code wins.
    
    One additional option is offset field which is helpful if we would like to
    have multiple batches at different offsets within the page and select them
    based on some criteria. This is not a requirement at this point but could
    help in future (Dave).
    
    Chris provided some helpful macros and suggestions which further simplified
    the code, they will also help in reducing code duplication when WA for
    other Gen are added. Add detailed comments explaining restrictions.
    Use do {} while(0) for wa_ctx_emit() macro.
    
    (Many thanks to Chris, Dave and Thomas for their reviews and inputs)
    
    Cc: Chris Wilson <chris@chris-wilson.co.uk>
    Cc: Dave Gordon <david.s.gordon@intel.com>
    Signed-off-by: default avatarRafael Barbalho <rafael.barbalho@intel.com>
    Signed-off-by: default avatarArun Siluvery <arun.siluvery@linux.intel.com>
    Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
    Signed-off-by: default avatarDaniel Vetter <daniel.vetter@ffwll.ch>
    17ee950d
intel_ringbuffer.h 15.7 KB