Multithreaded Rendering - Practical Rendering and Computation with Direct3D 11

their respective types and required shader models, would be distributed among the worker threads. Each thread would compile and create its list of shader objects in complete isolation from the others, eliminating possible synchronization issues. Then the main application thread would simply wait for all of the worker threads to complete, and would then read the resulting objects and store them centrally for use later on.
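The shader-loading scheme just described can be modeled with standard C++ threads. The following is only a minimal sketch under stated assumptions: `ShaderJob`, `CompiledShader`, `compileShader`, and `compileAll` are hypothetical names that do not come from the text, and the placeholder compile step stands in for the real compile-and-create work. What it shows is the isolation: each worker writes only to its own result list, and the main thread merges the lists after every worker has finished.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins: a real application would hold shader source,
// a target profile, and the resulting Direct3D 11 shader object instead.
struct ShaderJob {
    std::string file;     // shader source file (hypothetical)
    std::string profile;  // required shader model, e.g. "vs_5_0"
};

struct CompiledShader {
    std::string name;  // placeholder for the created shader object
};

// Placeholder for the real compile-and-create step (assumption).
CompiledShader compileShader(const ShaderJob& job) {
    return CompiledShader{job.file + ":" + job.profile};
}

// Splits the jobs among workerCount threads. Each thread writes only to
// its own output list, so no locking is needed. The calling thread waits
// for all workers and then merges the results into one central list.
std::vector<CompiledShader> compileAll(const std::vector<ShaderJob>& jobs,
                                       std::size_t workerCount) {
    assert(workerCount > 0);
    std::vector<std::vector<CompiledShader>> perThread(workerCount);
    std::vector<std::thread> workers;
    for (std::size_t w = 0; w < workerCount; ++w) {
        workers.emplace_back([&jobs, &perThread, w, workerCount] {
            // Worker w takes jobs w, w + workerCount, w + 2 * workerCount, ...
            for (std::size_t i = w; i < jobs.size(); i += workerCount) {
                perThread[w].push_back(compileShader(jobs[i]));
            }
        });
    }
    for (std::thread& t : workers) {
        t.join();  // the main thread waits for every worker to complete
    }

    // Central storage: results are grouped by worker, not by job order.
    std::vector<CompiledShader> all;
    for (const std::vector<CompiledShader>& list : perThread) {
        all.insert(all.end(), list.begin(), list.end());
    }
    return all;
}
```

Because the merged list is grouped by worker, a real application would probably key the results by name or index rather than rely on their position.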

7.5.3 Multithreaded Submissions

The previous two examples are interesting, and they provide some insight into how resource-intensive tasks can use the Direct3D 11 multithreading tools to gain a performance advantage. However, the largest performance potential lies in the ability to split the rendering work for a frame among several CPU cores. As described above, this allows the overall CPU/driver cost of rendering a scene to be spread over several threads running simultaneously, reducing the time spent sending work to the GPU. Of course, there are many ways to implement this concept, and some may fit a particular application better than others. We will discuss a few possibilities here, and try to provide some context about why each of them would be a good choice in a particular situation.

The general scenario for the following discussion is this. You have one main thread that houses the immediate context, and several worker threads (one for each CPU core in the system), each with a deferred context. When a frame needs to be rendered, its total workload is split up in some manner among the worker threads, each of which generates a command list. When all of the command lists are executed in the proper order on the immediate context, the final rendered image is produced. The order of execution is only important when the contents of one resource must be modified before the resource is used. For example, if one command list generates a shadow map, and a second command list uses the shadow map to render a scene, the shadow map must be generated before it is used. There are other cases where the order will not matter, such as per-object command lists (discussed below).
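The record-then-execute pattern can be sketched without any Direct3D objects. The model below uses standard C++ threads, and every name in it (`CommandList`, `RecordFn`, `recordInParallel`, `executeInOrder`, `renderFrame`) is a hypothetical stand-in rather than part of the Direct3D 11 API. In a real application, each worker would record into a deferred context obtained from `ID3D11Device::CreateDeferredContext` and close it with `FinishCommandList`, and the main thread would call `ExecuteCommandList` on the immediate context. For brevity the sketch also spawns one thread per task instead of keeping one long-lived worker per CPU core.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a recorded command list. In Direct3D 11 this
// role is played by ID3D11CommandList.
using CommandList = std::vector<std::string>;

// A unit of frame work: records its commands into the given list.
using RecordFn = std::function<void(CommandList&)>;

// Runs each recording task on its own thread. Slot i of the result always
// belongs to task i, so the submission order is fixed no matter which
// thread finishes first.
std::vector<CommandList> recordInParallel(const std::vector<RecordFn>& tasks) {
    std::vector<CommandList> lists(tasks.size());
    std::vector<std::thread> workers;
    for (std::size_t i = 0; i < tasks.size(); ++i) {
        workers.emplace_back([&tasks, &lists, i] { tasks[i](lists[i]); });
    }
    for (std::thread& t : workers) {
        t.join();
    }
    return lists;
}

// Stand-in for the immediate context: plays the lists back in slot order
// on the calling thread.
std::vector<std::string> executeInOrder(const std::vector<CommandList>& lists) {
    std::vector<std::string> submitted;
    for (const CommandList& list : lists) {
        submitted.insert(submitted.end(), list.begin(), list.end());
    }
    return submitted;
}

// Example frame: the shadow map pass occupies slot 0 and the scene pass
// that reads the shadow map occupies slot 1, so playback respects the
// dependency even though both passes are recorded concurrently.
std::vector<std::string> renderFrame() {
    std::vector<RecordFn> tasks = {
        [](CommandList& cl) { cl.push_back("draw shadow map"); },
        [](CommandList& cl) { cl.push_back("draw scene using shadow map"); },
    };
    std::vector<CommandList> lists = recordInParallel(tasks);
    assert(lists.size() == tasks.size());
    return executeInOrder(lists);
}
```

Fixing the playback order by slot index rather than by completion time is one simple way to honor resource dependencies. When no such dependency exists, as with per-object command lists, the slots could be played back in any order.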

Deciding how to split up the work among the threads will likely depend on the types of scenes being rendered, as well as their contents. Let's look a little closer at three potential techniques for splitting up the scene's rendering workload. Note that these are only three of many possible techniques, and other variations may perform better in a given situation.

Per-View Command Lists

The first method of dividing a scene for rendering is perhaps the least intrusive of the three. The general idea is to generate a command list for each view of a scene. In this sense, a view can be considered one complete rendering pass in which the complete scene is rendered.
