Our view-level multithreading granularity provides good potential for reduced CPU
overhead. Some rendering passes use very similar rendering effects for most, if not all, of
the scene objects. Consider a shadow map generation pass: most objects will use the exact
same pixel shader to output the appropriate depth value, and most will also use a similar
transformation shader setup (vertex and/or tessellation shaders), with some variations
for static versus dynamic geometry. These types of rendering passes are typically
presorted at the view level, so all of the operations that a deferred context executes to set
up a rendering pass (such as setting render targets or configuring stencil state) are
amortized over many draw calls.
However, because its command lists are generally larger, this view-based processing
has little chance to reuse command lists from frame to frame. As discussed
above, it is possible to update the dynamic state of objects by modifying the contents of
the resources used, but this does not allow the application to change which objects
are rendered and which are culled in a given frame. This should not be seen as a critical
problem, since only a few view-sized command lists are generated for each frame.
Per-Object Command Lists
The next level of granularity we could use is to process the scene at the individual
object level. In this scheme, the worker threads would generate one command list for each
object that will be rendered. This introduces a much finer level of processing and
consequently increases the number of command lists that must be generated and executed.
Because of the larger number of command lists, it is probably advantageous to use deferred
context state propagation between command list generations. This would alleviate the need
to repeat higher-level rendering setup calls, such as setting render targets, in every one
of the many command lists. The additional command lists would
also require a higher number of FinishCommandList() and ExecuteCommandList() calls,
which may or may not impact performance. Since each command list is generated in isolation
from the rest of the scene, this scheme also reduces the amount of batching that can be performed.
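The trade-off described above can be made concrete with a small counting model. The sketch below is plain C++ and does not call Direct3D; the CallCounts struct and the viewLevel() and perObject() functions are hypothetical names introduced only for illustration. It assumes, following the reasoning in the text, that pass-level setup is recorded once per view-level list, once per object when no state is propagated, and once in total when deferred context state is propagated.

```cpp
#include <cstddef>

// Hypothetical cost model (not Direct3D code): application-side calls needed
// to submit `objects` draw calls for one rendering pass.
struct CallCounts {
    std::size_t setupCalls;    // pass-level setup, e.g. binding render targets
    std::size_t drawCalls;     // one draw call per object
    std::size_t finishCalls;   // FinishCommandList() calls
    std::size_t executeCalls;  // ExecuteCommandList() calls
};

// One command list per view: setup is recorded once and amortized over all draws.
constexpr CallCounts viewLevel(std::size_t objects) {
    return CallCounts{1, objects, 1, 1};
}

// One command list per object. Assumption: without state propagation every
// list repeats the pass setup; with propagation it is issued only once.
constexpr CallCounts perObject(std::size_t objects, bool propagateState) {
    return CallCounts{
        propagateState ? (objects > 0 ? std::size_t{1} : std::size_t{0}) : objects,
        objects, objects, objects};
}
```

Under these assumptions, a pass with 100 objects needs 100 setup calls per frame at per-object granularity without propagation, but only one with propagation. In both per-object cases the number of FinishCommandList() and ExecuteCommandList() calls grows from 1 to 100.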
However, this paradigm also has some interesting side effects that could prove to
be beneficial. Once a command list is generated for a particular object for a particular
rendering pass, there is likely no reason to release and recreate the command list for every
frame. Since any per-frame dynamic rendering data, such as view or skinning matrices, is
provided to the shader programs in constant buffers, the command list will not change from
frame to frame as long as the same constant buffers are updated and used for every frame.
With no per-frame overhead for generating the command list, the additional costs discussed
above could largely be overcome. The simplicity of such a scheme is attractive
as well: each object would simply receive its own command list and use it as necessary.
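The reuse argument above rests on one property: a recorded command list refers to its constant buffers rather than copying their contents. The sketch below models that property in plain C++ without calling Direct3D. ConstantBuffer, CommandList, record(), execute(), and renderFrames() are hypothetical stand-ins introduced for illustration, and the "rendering" is reduced to adding a per-frame offset to one baked-in value.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a constant buffer holding per-frame data.
struct ConstantBuffer {
    float worldOffset = 0.0f;
};

// Hypothetical stand-in for a recorded command list. It keeps a reference to
// the constant buffer, not a snapshot of the buffer's contents.
struct CommandList {
    std::shared_ptr<const ConstantBuffer> constants;
    float vertexX;  // static data fixed at record time
};

// Records the list and counts how many times recording has happened.
CommandList record(std::shared_ptr<const ConstantBuffer> cb, float vertexX,
                   std::size_t& recordCount) {
    ++recordCount;
    return CommandList{std::move(cb), vertexX};
}

// "Executes" the list: the buffer is read when the list runs, not when it
// was recorded.
float execute(const CommandList& list) {
    assert(list.constants != nullptr);
    return list.vertexX + list.constants->worldOffset;
}

// Renders several frames of one object: the list is recorded once, and only
// the constant buffer contents change from frame to frame.
std::vector<float> renderFrames(std::size_t frames, std::size_t& recordCount) {
    auto cb = std::make_shared<ConstantBuffer>();
    CommandList list = record(cb, 1.0f, recordCount);
    std::vector<float> results;
    for (std::size_t f = 0; f < frames; ++f) {
        cb->worldOffset = static_cast<float>(f);  // per-frame update
        results.push_back(execute(list));         // reuse the same list
    }
    return results;
}
```

In this model each frame produces a different result even though record() runs only once, which is the behavior the text relies on when it keeps a per-object command list alive across frames.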