their respective types and required shader models, would be distributed among the worker
threads. Each thread would compile and create its list of shader objects in complete
isolation from the others, eliminating possible synchronization issues. Then the main
application thread would simply wait for all of the worker threads to complete, and would
then read the resulting objects and store them centrally for use later on.
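The scheme described above can be sketched in portable C++. This is only an illustration under stated assumptions: `ShaderRequest`, `ShaderObject`, `CompileShader`, and `CompileAll` are hypothetical names that do not come from the text, and the placeholder compile step stands in for whatever compile-and-create calls the application actually makes (in Direct3D 11, typically `D3DCompile` followed by a creation method such as `ID3D11Device::CreateVertexShader`). The point is the structure: each worker writes only to its own output list, so no locks are needed, and the main thread joins all workers before merging.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins: a real application would store the shader source,
// entry point, and target profile in the request, and a shader interface
// pointer in the result.
struct ShaderRequest { std::string file; std::string profile; };
struct ShaderObject  { std::string description; };

// Placeholder for the real compile-and-create step (assumption, not D3D11 API).
ShaderObject CompileShader(const ShaderRequest& request) {
    return ShaderObject{ request.file + " [" + request.profile + "]" };
}

// Distributes the requests round-robin over workerCount threads. Each thread
// appends only to its own vector, so the workers never touch shared state.
std::vector<ShaderObject> CompileAll(const std::vector<ShaderRequest>& requests,
                                     std::size_t workerCount) {
    if (workerCount == 0) workerCount = 1;

    std::vector<std::vector<ShaderObject>> perThread(workerCount);
    std::vector<std::thread> workers;

    for (std::size_t t = 0; t < workerCount; ++t) {
        workers.emplace_back([&requests, &perThread, t, workerCount] {
            for (std::size_t i = t; i < requests.size(); i += workerCount) {
                perThread[t].push_back(CompileShader(requests[i]));
            }
        });
    }

    // The main thread simply waits for every worker to finish.
    for (auto& worker : workers) worker.join();

    // Gather the per-thread results into one central list for later use.
    std::vector<ShaderObject> all;
    for (auto& list : perThread) {
        for (auto& shader : list) all.push_back(std::move(shader));
    }
    return all;
}
```

Note that the merged list is grouped by worker rather than by original request order. If the application needs to look shaders up later, storing them in a map keyed by file name would be one option.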
7.5.3 Multithreaded Submissions
The previous two examples show how resource-intensive tasks can use the Direct3D 11
multithreading tools to gain a performance advantage. However, the largest performance
potential lies in the ability to split the rendering work for a frame among several CPU
cores. As described above, this allows the overall CPU/driver cost of rendering a scene
to be spread across several threads running simultaneously, reducing the time spent
sending work to the GPU. Of course, there are many ways to implement this concept, and
some may fit a particular application better than others. We will discuss a few
possibilities here, and try to provide some context about why each of them would be a
good choice in a particular situation.
The general scenario for the following discussion is this. You have one main thread
that houses the immediate context, and several worker threads (one for each CPU core in
the system), each with a deferred context. When a frame needs to be rendered, its total
workload is split up in some manner among the worker threads, each of which generates a
command list. When all of the command lists are executed in the proper order on the
immediate context, the final rendered image is produced. The order of execution is only
important when the contents of one resource must be modified before the resource is used.
For example, if one command list generates a shadow map, and a second command list uses
the shadow map to render a scene, the shadow map must be generated before it is used.
There are other cases where the order will not matter, such as per-object command lists
(discussed below).
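The flow just described can be illustrated with a small portable C++ mock. It does not call Direct3D: `DeferredContext`, `ImmediateContext`, `CommandList`, and `RenderFrame` are hypothetical stand-ins that only record strings. In a real application the corresponding pieces would be a deferred context obtained from `ID3D11Device::CreateDeferredContext`, a command list produced by `ID3D11DeviceContext::FinishCommandList`, and submission through `ID3D11DeviceContext::ExecuteCommandList` on the immediate context. For brevity the sketch also assumes one thread per pass rather than one per CPU core.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Mock command list: just the recorded command names (assumption for the sketch).
using CommandList = std::vector<std::string>;

// Mock deferred context: records commands without executing anything.
struct DeferredContext {
    CommandList recorded;
    void Draw(const std::string& what) { recorded.push_back(what); }
    // Hands back everything recorded so far and leaves the context empty.
    CommandList FinishCommandList() {
        CommandList finished;
        finished.swap(recorded);
        return finished;
    }
};

// Mock immediate context: "executing" a list appends it to the submission log.
struct ImmediateContext {
    std::vector<std::string> submitted;
    void ExecuteCommandList(const CommandList& list) {
        submitted.insert(submitted.end(), list.begin(), list.end());
    }
};

// Records one command list per pass in parallel, then submits the lists on the
// main thread in pass order, so pass 0 (e.g. a shadow map) always lands first.
std::vector<std::string> RenderFrame(
        const std::vector<std::vector<std::string>>& passes) {
    std::vector<CommandList> lists(passes.size());
    std::vector<std::thread> workers;

    for (std::size_t i = 0; i < passes.size(); ++i) {
        workers.emplace_back([&passes, &lists, i] {
            DeferredContext context;  // each worker owns its deferred context
            for (const auto& item : passes[i]) context.Draw(item);
            lists[i] = context.FinishCommandList();  // writes only slot i
        });
    }
    for (auto& worker : workers) worker.join();

    ImmediateContext immediate;
    for (const auto& list : lists) immediate.ExecuteCommandList(list);
    return immediate.submitted;
}
```

Recording can finish in any order across the workers. What fixes the final result is the order of the `ExecuteCommandList` calls on the main thread, which is why the shadow-map list must be placed ahead of the list that consumes it.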
Deciding how to split up the work among the threads will likely depend on the types
of scenes being rendered, as well as their contents. Let's look a little closer at three
potential techniques for splitting up the scene's rendering workload. Please note that
these are only three of the possible techniques, and that other variations may perform
better or worse in a given situation.
Per-View Command Lists
The first method of dividing a scene for rendering is perhaps the least intrusive of the three.
The general idea is to generate a command list for each view of a scene. In this sense, a view
can be considered one complete rendering pass in which the complete scene is rendered.