This leaves us with two choices when designing an algorithm with more significant processing, especially at the pixel level. The first choice is to build a hybrid renderer that performs some of the processing on a more general processor, such as the host, or perhaps on a general computation API (e.g., CUDA, DirectCompute, OpenCL, OpenGL Compute). Hybrid renderers typically incur the cost of additional memory operations and the associated synchronization complexity. The second choice is to frame the algorithm purely in terms of rasterization operations and make multiple rasterization passes. For example, we can't conveniently cast shadow rays in most hardware rendering APIs today, but we can sample from a previously rendered shadow map.
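For concreteness, here is a minimal host-side sketch of that two-pass structure using the OpenGL API. The framebuffer object, depth texture, view-projection matrices, and drawScene helper are assumed to exist and are named only for illustration; error checking is omitted.

    void renderWithShadowMap(GLuint shadowFBO, GLuint shadowDepthTexture,
                             int shadowSize, int windowWidth, int windowHeight) {
        // Pass 1: rasterize the scene from the light's viewpoint into a depth texture.
        glBindFramebuffer(GL_FRAMEBUFFER, shadowFBO);     // assumed FBO with a depth attachment
        glViewport(0, 0, shadowSize, shadowSize);
        glClear(GL_DEPTH_BUFFER_BIT);
        drawScene(lightViewProjection);                   // assumed helper that issues the draw calls

        // Pass 2: rasterize from the camera; the pixel shader samples the shadow
        // map instead of casting a shadow ray toward the light source.
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
        glViewport(0, 0, windowWidth, windowHeight);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
        glActiveTexture(GL_TEXTURE0);
        glBindTexture(GL_TEXTURE_2D, shadowDepthTexture); // bound so the shader can compare depths
        drawScene(cameraViewProjection);
    }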
Similar methods exist for implementing reflection, refraction, and indirect illumination purely in terms of rasterization. These avoid much of the performance overhead of hybrid rendering and leverage the high performance of hardware rasterization. However, they may not be the most natural way of expressing an algorithm, and that may lead to a net inefficiency and certainly to additional software complexity. Recall that changing the order of iteration from ray casting to rasterization increased the space demands of rendering by requiring a depth buffer to store intermediate results. In general, converting an arbitrary algorithm to a rasterization-based one often has this effect. The space demands might grow larger than is practical in cases where those intermediate results are themselves large.
Shading languages are almost always compiled into executable code at runtime, inside the API. That is because even within products from one vendor the underlying micro-architecture may vary significantly. This creates a tension within the compiler between optimizing the target code and producing the executable quickly. Most implementations err on the side of optimization, since shaders are often loaded once per scene. Beware that if you synthesize or stream shaders throughout the rendering process, there may be substantial overhead.
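In OpenGL, for example, that runtime step is visible in the API itself: the application hands GLSL source strings to the driver, which parses, optimizes, and generates device-specific code on the spot. A minimal sketch follows; error checking is omitted and the source strings are assumed to come from the application.

    // The driver compiles and links at runtime, for whatever GPU is actually present.
    GLuint compileProgram(const char* vertexSrc, const char* fragmentSrc) {
        GLuint vs = glCreateShader(GL_VERTEX_SHADER);
        glShaderSource(vs, 1, &vertexSrc, nullptr);
        glCompileShader(vs);                 // parsing, optimization, and code generation happen here

        GLuint fs = glCreateShader(GL_FRAGMENT_SHADER);
        glShaderSource(fs, 1, &fragmentSrc, nullptr);
        glCompileShader(fs);

        GLuint program = glCreateProgram();
        glAttachShader(program, vs);
        glAttachShader(program, fs);
        glLinkProgram(program);              // a second, potentially expensive step
        glDeleteShader(vs);                  // safe to flag for deletion once attached
        glDeleteShader(fs);
        return program;
    }

Because each such compile and link can be expensive, the usual practice is to invoke this once at scene-load time rather than per frame, which is exactly the trade-off described above.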
Some languages (e.g., HLSL and CUDA) offer an initial compilation step to an intermediate representation. This eliminates the runtime cost of parsing and some trivial compilation operations while maintaining flexibility to optimize for a specific device. It also allows software developers to distribute their graphics applications without revealing the shading programs to the end user in a human-readable form on the file system. For closed systems with fixed specifications, such as game consoles, it is possible to compile shading programs down to true machine code, because on those systems the exact runtime device is known at host-program compile time. However, doing so would reveal some details of the proprietary micro-architecture, so even in this case vendors do not always choose to have their APIs perform a complete compilation step.
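As a sketch of that workflow, the following Windows-specific fragment compiles an HLSL pixel shader to bytecode once, e.g., in an offline tool or build step, and writes the blob to disk; the file names and entry point are illustrative only. At run time the application loads the blob, and the driver performs only the final, device-specific translation.

    #include <d3dcompiler.h>
    #include <fstream>
    #pragma comment(lib, "d3dcompiler.lib")

    bool compileToBytecode() {
        ID3DBlob* code = nullptr;
        ID3DBlob* errors = nullptr;
        HRESULT hr = D3DCompileFromFile(L"shade.hlsl", nullptr, nullptr,
                                        "main", "ps_5_0",
                                        D3DCOMPILE_OPTIMIZATION_LEVEL3, 0,
                                        &code, &errors);
        if (FAILED(hr)) {
            if (errors) { errors->Release(); }
            return false;
        }
        // Persist the intermediate representation; this blob, not the HLSL
        // source, is what ships with the application.
        std::ofstream out("shade.cso", std::ios::binary);
        out.write(static_cast<const char*>(code->GetBufferPointer()),
                  code->GetBufferSize());
        code->Release();
        return true;
    }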
15.7.2.4 Executing Draw Calls
To invoke the shaders, we issue draw calls. These occur on the host side. One typically clears the framebuffer and then, for each mesh, performs the following operations (a sketch of the resulting loop follows the list).
1. Set fixed function state.
2. Bind a shader.
3. Set shader arguments.
4. Issue the draw call.
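A minimal OpenGL-flavored rendering of that loop might look like the following; the Mesh structure, uniform name, and scene container are illustrative and not part of any particular API.

    // One frame, following the four steps above.
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
    for (const Mesh& mesh : scene.meshes) {
        glEnable(GL_DEPTH_TEST);                           // 1. set fixed-function state
        glUseProgram(mesh.program);                        // 2. bind a shader
        GLint mvp = glGetUniformLocation(mesh.program, "modelViewProjection");
        glUniformMatrix4fv(mvp, 1, GL_FALSE,
                           mesh.modelViewProjection);      // 3. set shader arguments
        glBindVertexArray(mesh.vertexArray);
        glDrawElements(GL_TRIANGLES, mesh.indexCount,
                       GL_UNSIGNED_INT, nullptr);          // 4. issue the draw call
    }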
 