Multithreaded Rendering - Practical Rendering and Computation with Direct3D 11


buffers), and states needed to render that terrain page. For each frame to be rendered, the

command list of every visible terrain page would be executed on the immediate context.

These command lists could potentially be reused, since the view matrix required for ren-

dering (or some combination of matrices including the view matrix) would be updated in

a constant buffer, whose contents are not included in the command list. Only the reference

to the constant buffer would be included in the list, not its contents.
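
The reuse argument above can be illustrated with a small toy model. This is only a sketch in portable C++, not Direct3D code: ConstantBuffer, CommandList, recordDraw, and execute are hypothetical names invented for illustration. In Direct3D 11 itself, recording ends with ID3D11DeviceContext::FinishCommandList and playback uses ExecuteCommandList on the immediate context. The point of the model is that each recorded command stores a reference to the buffer, so the same list produces different results once the buffer is updated.

```cpp
#include <functional>
#include <vector>

// Toy stand-in for a constant buffer: holds one value, e.g. a view offset.
// (Hypothetical type, not part of Direct3D.)
struct ConstantBuffer {
    float viewOffset = 0.0f;
};

// Toy stand-in for a recorded command list. Each command captures a
// pointer to the buffer it reads, never a copy of the buffer's contents.
struct CommandList {
    std::vector<std::function<float()>> commands;

    void recordDraw(const ConstantBuffer* cb, float vertexX) {
        // Only the pointer 'cb' is stored; the value is read at execution.
        commands.push_back([cb, vertexX] { return vertexX + cb->viewOffset; });
    }

    // "Executes" the list and returns the transformed positions.
    std::vector<float> execute() const {
        std::vector<float> out;
        for (const auto& cmd : commands) out.push_back(cmd());
        return out;
    }
};
```

Recording the list once, changing viewOffset, and calling execute() again yields positions based on the new offset without re-recording anything, which is the property that makes the terrain page command lists reusable from frame to frame.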

As the viewer moves around in the scene, the worker threads could dynamically

load new terrain pages from disk, as needed. Using their deferred contexts, they could add

the Map/Unmap sequences for updating the terrain page's vertex buffers with the new data

into their command list sequences. Then, the next time that particular terrain page is ren-

dered, the vertex buffer resource could be updated and made available for rendering. This

provides a simple approach to updating the terrain pages, with minimal synchronization

required between the main rendering thread and the worker threads.

7.5.2 Multithreaded Shader Creation

The terrain paging system discussed in the previous section would certainly benefit from

being run on a multithreaded system. However, not all applications need to use a data set

that is so large that it must be paged into memory dynamically. Even so, there are some

tasks that most, if not all, applications need to implement, and that could also benefit from

multithreaded resource creation. Our next example fits into this broader context—creating

shader objects at startup to be used during rendering. This seems like a somewhat trivial

task, but depending on the number of shader programs your application will use, as well

as their complexity, and the situations that they need to be used in, a "combinatorial explo-

sion" can produce a very large number of required shader objects. The time required to

create all of these objects can easily become unmanageable.

Creating a shader program requires two steps. First, the shader source code must

be compiled to a byte-code format. Then, that byte code is submitted to the free-threaded

device methods to create a shader program for a given programmable pipeline stage (ad-

ditional details about shader creation are available in Chapter 6). Since shader compilation

is relatively CPU intensive, it presents a good opportunity for parallelization when more

than one shader must be compiled on a system with more than one CPU core.
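
A minimal sketch of the parallel compilation step might look like the following. It is written in portable C++ so that it stays self-contained: ByteCode, CompileFn, and compileAll are hypothetical names, and the compile callback is a stand-in for wherever the application would call a real compiler entry point such as D3DCompile. The sketch assumes the callback is safe to call from several threads at once. It starts one worker per hardware core and lets each worker claim the next uncompiled shader through an atomic counter.

```cpp
#include <algorithm>
#include <atomic>
#include <cstddef>
#include <functional>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for compiled shader byte code.
using ByteCode = std::vector<unsigned char>;

// Hypothetical compile callback. In a Direct3D 11 application this is
// where a call such as D3DCompile would go.
using CompileFn = std::function<ByteCode(const std::string& source)>;

// Compiles every source using one worker thread per hardware core.
// Results are returned in the same order as the inputs.
std::vector<ByteCode> compileAll(const std::vector<std::string>& sources,
                                 const CompileFn& compile) {
    std::vector<ByteCode> results(sources.size());
    std::atomic<std::size_t> next{0};

    // hardware_concurrency() may return 0 when the count is unknown.
    unsigned workerCount = std::max(1u, std::thread::hardware_concurrency());

    auto worker = [&] {
        // Each worker claims the next unclaimed index until none remain.
        for (std::size_t i = next.fetch_add(1); i < sources.size();
             i = next.fetch_add(1)) {
            results[i] = compile(sources[i]);  // Each slot has one writer.
        }
    };

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < workerCount; ++t) workers.emplace_back(worker);
    for (auto& w : workers) w.join();
    return results;
}
```

Because every result slot is written by exactly one worker, no lock is needed around the results vector. In a real application the byte code produced here would then be passed to a device creation method such as ID3D11Device::CreateVertexShader, which could be called from the same worker.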

In the second step, the free-threaded methods of the device allow multiple threads to

simultaneously create shader objects after the compilation step. This eliminates the need

to synchronize multiple threads to create the shader objects, which would have been re-

quired in older versions of Direct3D. A simple implementation to allow parallel loading

of shader programs would create one worker thread for each available processing core.

This arrangement is essentially a thread pool, a concept that may be familiar to the reader

from standard multithreading designs. Then the list of shader programs to be loaded, and
