The Rendering Pipeline - Practical Rendering and Computation with Direct3D 11

As seen in Figure 3.3, the input to the shader core is provided in the v# registers.[3] Because they supply the input to the stage, they are naturally read-only. When a shader program is executed, its input data is available in the v# registers. After the data has been read, it can be manipulated and combined with other data, and any intermediate calculations can be stored in the r# and x#[n] registers. These are called temporary registers, and since they hold intermediate values, they are both readable and writable by a shader program. The texture registers (t#), constant buffer registers (cb#[n]), immediate constant buffer register (icb[index]), and unordered access registers (u#) are also available as data sources. These registers provide access to the device memory resources described in Chapter 2, and they are all read-only except for the unordered access registers. Finally, the calculated values that will be passed on to the next pipeline stage are written into the output registers (o#). When the shader program has terminated, the values stored in the output registers are passed on to the input registers of the next stage, where the process is repeated. A few other special-purpose registers are only available in certain stages, so we will defer discussion of them until later in the chapter.
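
The register classes described above can be summarized as a small lookup table. The following C++ sketch is only an illustration of that summary: the `RegisterClass` struct, the `kRegisterClasses` table, and the `isWritable` function are hypothetical names invented here and are not part of Direct3D. The access permissions are taken from the description in the text.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Hypothetical summary record for one class of shader core registers.
struct RegisterClass {
    const char* name;     // assembly notation, e.g. "v#"
    const char* purpose;  // role described in the text
    bool writable;        // can a shader program write to it?
};

// The register classes named in the text, with their access permissions.
const std::array<RegisterClass, 8> kRegisterClasses = {{
    {"v#",         "stage input",               false},
    {"r#",         "temporary",                 true},
    {"x#[n]",      "temporary (indexed)",       true},
    {"t#",         "texture resource",          false},
    {"cb#[n]",     "constant buffer",           false},
    {"icb[index]", "immediate constant buffer", false},
    {"u#",         "unordered access resource", true},
    {"o#",         "stage output",              true},
}};

// Returns whether a shader program may write to the given register class.
// The name must be one of the notations listed in kRegisterClasses.
inline bool isWritable(const char* name) {
    for (const RegisterClass& rc : kRegisterClasses) {
        if (std::strcmp(rc.name, name) == 0) {
            return rc.writable;
        }
    }
    assert(false && "unknown register class");
    return false;
}
```

For instance, `isWritable("v#")` yields `false` while `isWritable("u#")` yields `true`, matching the statement that the unordered access registers are the only resource registers a shader program can write to.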

Typically, a developer does not need to inspect the assembly listing of a compiled shader program unless there is a performance issue.[4] This makes understanding the details of how the assembly instructions operate less critical. Even so, it is still helpful to have a basic knowledge of the assembly-based world. For example, when developers define input and output data structures for a shader program, they must be aware of the limits on how many input and output vectors can be used for each stage. These limits are determined by the number of input and output registers available for that particular stage. Similarly, the available number of constant buffers, textures, and unordered access resources is limited by their respective registers. These limits should be kept in mind as we proceed through each of the pipeline stage discussions.
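
As a rough illustration of why these limits matter, the sketch below counts the registers that a shader input or output structure would need and compares the total with a stage limit. It assumes each structure element occupies a whole number of four-component registers and ignores any packing the compiler may perform, so it is a conservative estimate. The names `SignatureElement`, `registersUsed`, and `fitsRegisterLimit` are hypothetical, and the limit is passed in as a parameter because the real per-stage values should be taken from the Direct3D 11 documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One element of a hypothetical shader input or output structure.
struct SignatureElement {
    const char* semantic;       // e.g. "POSITION"
    std::size_t registerCount;  // four-component registers it occupies
};

// Sums the registers needed by every element of the structure.
inline std::size_t registersUsed(const std::vector<SignatureElement>& elements) {
    std::size_t total = 0;
    for (const SignatureElement& e : elements) {
        assert(e.registerCount > 0 && "each element needs at least one register");
        total += e.registerCount;
    }
    return total;
}

// Returns true if the structure fits within the given stage register limit.
inline bool fitsRegisterLimit(const std::vector<SignatureElement>& elements,
                              std::size_t stageRegisterLimit) {
    return registersUsed(elements) <= stageRegisterLimit;
}
```

A structure holding a position, a normal, and a 4x4 matrix would be described as `{{"POSITION", 1}, {"NORMAL", 1}, {"WORLD", 4}}` under these assumptions, for a total of six registers.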

GPU Architectures

Even with a strictly defined assembly language specification, the actual GPU hardware is not required to implement the specification directly. There are many different architectural implementations, which can vary widely from one vendor to the next. In fact, even consecutive generations of GPU hardware from the same vendor can vary significantly from one another. This makes it very difficult to predict how efficiently a given shader program will execute on a current or future GPU. Depending on the architecture of the GPU executing the program, one particular memory access pattern may be more efficient than another, but the opposite could be true on a different architecture.
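
To make the idea of a memory access pattern concrete, the sketch below stores the same data in two layouts and computes the same result from each. It is plain CPU-side C++ used as a stand-in, not shader code, and the `Particle` and `ParticleArrays` types and both functions are hypothetical. The two functions return identical values but touch memory differently, and which one runs faster depends on the hardware executing it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structures layout: all fields of one particle are adjacent.
struct Particle {
    float x, y, z, mass;
};

// Structure-of-arrays layout: each field is stored in its own array.
struct ParticleArrays {
    std::vector<float> x, y, z, mass;
};

// Strided access: reads one float out of every four stored.
inline float totalMassAoS(const std::vector<Particle>& particles) {
    float sum = 0.0f;
    for (const Particle& p : particles) {
        sum += p.mass;
    }
    return sum;
}

// Contiguous access: reads consecutive floats from a single array.
inline float totalMassSoA(const ParticleArrays& particles) {
    // All field arrays are expected to describe the same number of particles.
    assert(particles.x.size() == particles.mass.size());
    float sum = 0.0f;
    for (float m : particles.mass) {
        sum += m;
    }
    return sum;
}
```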

[3] The # symbol indicates that multiple registers are available, each identified by an integer index. For example, v0 and v1 are the first two input registers available for use.

[4] Details about how to compile a shader and view its assembly listing are provided in Chapter 6.
