High Level Shading Language - Practical Rendering and Computation with Direct3D 11

lists all values returned by the shader. It also indicates if the pixel shader runs at per-sample

frequency, which happens when it takes SV_SampleIndex as an input.

After the diagnostic information, the output also contains the fully compiled shader

program in assembly. While it's not typically useful to examine the generated assembly,

it can be helpful for verification purposes during performance analysis. In particular, it is

common to look for dynamic branching or looping constructs, since these can have a drastic

effect on performance. Dynamic branches can be spotted by looking for an if_comp

instruction, where comp is a two-letter abbreviation of a comparison. For instance, lt will be

used for a less-than comparison, and ge for a greater-than-or-equal comparison. Dynamic

loops will begin with a rep instruction. The output also contains an instruction count at the

end, which is simply the number of assembly instructions in the program.
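
As a rough illustration of that inspection, the following Python sketch scans a disassembly listing for the if_comp and rep opcodes described above. The function name find_dynamic_flow_control and the sample listing are hypothetical, written for this example rather than copied from real compiler output, and the sketch assumes one instruction per line with // comments.

```python
# Opcode patterns that the text identifies as markers of dynamic flow control.
BRANCH_PREFIX = "if_"   # e.g. if_lt, if_ge
LOOP_OPCODE = "rep"     # start of a dynamic loop


def find_dynamic_flow_control(listing):
    """Return (branches, loops) as lists of (line_number, instruction_text).

    Assumes one instruction per line and '//' for comments; this is a
    simplification chosen for the example, not a full assembly parser.
    """
    branches, loops = [], []
    for number, raw in enumerate(listing.splitlines(), start=1):
        # Drop any trailing comment and surrounding whitespace.
        code = raw.split("//", 1)[0].strip()
        if not code:
            continue
        opcode = code.split()[0].lower()
        if opcode.startswith(BRANCH_PREFIX):
            branches.append((number, code))
        elif opcode == LOOP_OPCODE:
            loops.append((number, code))
    return branches, loops


# Hypothetical listing, made up for illustration only.
SAMPLE_LISTING = """\
mov r0, c2
rep i0
  if_lt v0.x, c0.x
    add r0, r0, c1
  endif
endrep
mov oC0, r0
"""

if __name__ == "__main__":
    found_branches, found_loops = find_dynamic_flow_control(SAMPLE_LISTING)
    for number, code in found_branches:
        print(f"dynamic branch at line {number}: {code}")
    for number, code in found_loops:
        print(f"dynamic loop at line {number}: {code}")
```

Matching on the first token of each line keeps endif and endrep from being mistaken for the opening instructions, and stripping comments first avoids false hits on opcode names mentioned in diagnostic text.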

While it is possible to get a very rough estimate of the relative performance cost of

a shader through this number, in general, it is not a reliable figure. This is because shader

assembly is merely an intermediate format that is further compiled by the driver into a

microcode format that can be executed by the hardware. Thus, the final program could

have a very different number of instructions, and the number of cycles required to execute

those instructions could also vary, depending on the hardware. More importantly, even the

actual microcode instructions will not properly reflect the larger-scale performance

characteristics caused by memory accesses and multithreaded execution. To obtain more accurate

performance statistics regarding a shader program, specialized analysis and profiling tools

are available from the major graphics hardware vendors.
