As seen in Figure 3.3, the input to the shader core is provided in the v# registers. 3
Since they supply the input to the stage, they are naturally read only. After the input data
has been read, it can be manipulated and combined with other data, and any intermediate
calculations can be stored in the r# and x#[n] registers. These are called temporary registers,
and since they hold intermediate values, they are both readable and writable by a shader
program. The texture registers (t#), constant buffer registers (cb#[n]), immediate constant
buffer register (icb[index]), and unordered access registers (u#) are also available as data
sources. These registers are used to provide access to the device memory resources, as
described in Chapter 2, and they are all read only except for the unordered access registers.
Finally, the calculated values that will be passed on to the next pipeline stage are written
into the output registers (o#). When the shader program has terminated, the values stored in
the output registers are passed on to the input registers of the next stage, where the process
is repeated. A few other special purpose registers are only available in certain stages, so we
will defer discussion of them until later in the chapter.
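
The access rules just described can be illustrated with a small toy model. This is only a sketch in Python, not a Direct3D API: the names ACCESS, RegisterFile, and run_stage are hypothetical, and treating the o# registers as write only is an assumption, since the text states only that results are written to them.

```python
# Toy model of the register access rules described in the text.
# Illustration only; every name here is hypothetical, not a Direct3D API.

ACCESS = {
    "v": "r",    # input registers: read only
    "r": "rw",   # temporary registers: readable and writable
    "x": "rw",   # indexable temporary registers: readable and writable
    "t": "r",    # texture registers: read only
    "cb": "r",   # constant buffer registers: read only
    "icb": "r",  # immediate constant buffer register: read only
    "u": "rw",   # unordered access registers: the only writable resources
    "o": "w",    # output registers: assumed write only in this sketch
}


class RegisterFile:
    def __init__(self, inputs):
        # Each register holds one 4-component vector, stored as a tuple.
        self.banks = {name: {} for name in ACCESS}
        self.banks["v"] = dict(enumerate(inputs))

    def read(self, bank, index):
        if "r" not in ACCESS[bank]:
            raise PermissionError(f"{bank}{index} is not readable")
        return self.banks[bank][index]

    def write(self, bank, index, value):
        if "w" not in ACCESS[bank]:
            raise PermissionError(f"{bank}{index} is read only")
        self.banks[bank][index] = value


def run_stage(inputs):
    """Read v0 and v1, keep their sum in r0, then copy it to o0."""
    regs = RegisterFile(inputs)
    a, b = regs.read("v", 0), regs.read("v", 1)
    regs.write("r", 0, tuple(x + y for x, y in zip(a, b)))
    regs.write("o", 0, regs.read("r", 0))
    # The outputs of this stage become the v# inputs of the next stage.
    return [regs.banks["o"][i] for i in sorted(regs.banks["o"])]
```

Passing the list returned by run_stage into a new RegisterFile mimics the hand-off from one stage's output registers to the next stage's input registers, and any attempt to write to a v# register in that next stage raises an error.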
Typically, a developer does not need to inspect the assembly listing of a compiled
shader program, unless there is a performance issue. 4 This makes understanding the details
of how the assembly instructions operate less critical. Even so, it is still helpful to have a
basic knowledge of the assembly-based world. For example, when developers define input
and output data structures for a shader program, they must be aware of the limits on how
many input and output vectors can be used for each stage. This is determined by the number
of input and output registers available for that particular stage. Similarly, the number of
available constant buffers, textures, and unordered access resources is limited by their
respective registers. These limits should be kept in mind as we proceed through each of
the pipeline stage discussions.
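
As a rough illustration of this kind of budgeting, the sketch below counts how many registers an input or output structure would occupy and compares the count with a per-stage limit. The helper names are hypothetical, the limit is a parameter rather than a real Direct3D value, and the code assumes that every field takes a whole register. A real compiler may pack small fields together, so the count here is an upper bound.

```python
# Hypothetical helpers for checking a shader input/output structure
# against a register budget. The limit is supplied by the caller; look up
# the real value for the target pipeline stage before relying on it.

def registers_needed(fields):
    """Count registers, assuming each field of 1 to 4 components uses one.

    `fields` is a list of (name, component_count) pairs. No packing of
    small fields into a shared register is modeled.
    """
    total = 0
    for name, components in fields:
        if not 1 <= components <= 4:
            raise ValueError(f"{name}: expected 1 to 4 components")
        total += 1
    return total


def fits(fields, register_limit):
    """Return True if the structure fits within the given register limit."""
    return registers_needed(fields) <= register_limit


# Example structure: a position, a normal, and one texture coordinate.
vs_output = [("position", 4), ("normal", 3), ("texcoord", 2)]
PLACEHOLDER_LIMIT = 16  # placeholder value, not an authoritative limit
print(registers_needed(vs_output), fits(vs_output, PLACEHOLDER_LIMIT))
```

The same idea extends to constant buffers, textures, and unordered access resources: count the slots a shader uses and compare against the number of corresponding registers the stage exposes.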
GPU Architectures
Even with a strictly defined assembly language specification, the actual GPU hardware is
not required to directly implement the specification. There are many different architectural
implementations, which can vary widely from one vendor to the next. In fact, even consecutive
generations of GPU hardware from the same vendor can vary significantly from
one another. This makes it incredibly difficult to predict how efficiently a given shader program
will execute on a current or future GPU. Depending on the architecture of the GPU
executing the program, one particular memory access pattern may be more efficient than
another, but the opposite could be true with a different architecture.
3 The # symbol indicates that multiple registers are available that are identified by an integer index. For
example, v0 and v1 are the first two input registers available for use.
4 Details about how to compile a shader and view its assembly listing are provided in Chapter 6.