Ray Casting and Rasterization - Computer Graphics: Principles and Practice

// The depth test will run directly on the interpolated value in
// Q.z/Q.w, which is going to be smallest at the far plane
gpu->setDepthTest(RenderDevice::DEPTH_GREATER);
gpu->setDepthClearValue(0.0);

while (! done) {
    loopBody(gpu);
    processUserInput();
}

...

15.8 Performance and Optimization

We'll now consider several examples of optimization in hardware-based rendering. This is by no means an exhaustive list, but rather a set of model techniques from which you can draw ideas to generate your own optimizations when you need them.

15.8.1 Abstraction Considerations

Many performance optimizations will come at the price of significantly complicating the implementation. Weigh the performance advantage of an optimization against the additional cost of debugging and code maintenance. High-level algorithmic optimizations may require significant thought and restructuring of code, but they tend to yield the best tradeoff of performance for code complexity. For example, simply dividing the screen in half and asynchronously rendering each side on a separate processor nearly doubles performance at the cost of perhaps 50 additional lines of code that do not interact with the inner loop of the renderer.
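
As a rough illustration of the screen-splitting idea, the following C++ sketch renders the left and right halves of an image concurrently. The names Image, shadePixel, renderRegion, and renderInHalves are hypothetical stand-ins, not part of the chapter's renderer, and std::async is only one of several ways to run the two halves in parallel.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical image type: a row-major array of grayscale values.
struct Image {
    int width;
    int height;
    std::vector<float> pixels;

    Image(int w, int h)
        : width(w), height(h),
          pixels(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), 0.0f) {}
};

// Placeholder for the renderer's real per-pixel work (an assumption).
inline float shadePixel(int x, int y) {
    return static_cast<float>(x + y);
}

// Renders every pixel in columns [x0, x1). This plays the role of the
// existing renderer; note that it knows nothing about threads.
void renderRegion(Image& image, int x0, int x1) {
    assert(0 <= x0 && x0 <= x1 && x1 <= image.width);
    for (int y = 0; y < image.height; ++y) {
        for (int x = x0; x < x1; ++x) {
            const std::size_t i =
                static_cast<std::size_t>(y) * static_cast<std::size_t>(image.width)
                + static_cast<std::size_t>(x);
            image.pixels[i] = shadePixel(x, y);
        }
    }
}

// Splits the screen into two halves and renders them concurrently.
void renderInHalves(Image& image) {
    const int mid = image.width / 2;

    // Left half runs asynchronously on another thread.
    std::future<void> left = std::async(std::launch::async, [&image, mid] {
        renderRegion(image, 0, mid);
    });

    // Right half runs on the calling thread in the meantime.
    renderRegion(image, mid, image.width);

    // Wait for the left half before the image is used.
    left.get();
}
```

Because the two halves write disjoint pixels, no locking is needed. How close the speedup comes to a factor of two depends on how evenly the work divides between the halves.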

In contrast, consider some low-level optimizations that we intentionally passed over. These include reducing common subexpressions (e.g., mapping all of those repeated divisions to multiplications by an inverse that is computed once) and lifting constants outside loops. Performing those destroys the clarity of the algorithm, but will probably gain only a 50% throughput improvement.
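
To make these two low-level transformations concrete, here is a minimal C++ sketch. The functions scaleNaive and scaleHoisted and their parameters are invented for illustration and do not come from the chapter's renderer. The first version divides inside the loop and recomputes a loop-invariant product; the second computes the inverse and the product once, outside the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Straightforward version: one division per element, and the
// loop-invariant product (scale * gain) appears inside the loop.
void scaleNaive(std::vector<float>& v, float scale, float gain, float w) {
    assert(w != 0.0f);
    for (std::size_t i = 0; i < v.size(); ++i) {
        v[i] = v[i] * (scale * gain) / w;
    }
}

// Hand-optimized version: the constant is lifted out of the loop and
// the repeated division becomes a multiplication by an inverse that
// is computed once.
void scaleHoisted(std::vector<float>& v, float scale, float gain, float w) {
    assert(w != 0.0f);
    const float k = (scale * gain) * (1.0f / w);
    for (float& x : v) {
        x *= k;
    }
}
```

In floating point the two versions can differ in the last bits, because multiplying by 1/w is not always identical to dividing by w. The second version also hides the original formula behind the constant k, which is the loss of clarity the text warns about.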

This is not to say that low-level optimizations are not worthwhile. But they are primarily worthwhile when you have completed your high-level optimizations; at that point you are more willing to complicate your code and its maintenance because you are done adding features.

15.8.2 Architectural Considerations

The primary difference between the simple rasterizer and ray caster described in this chapter is that the “for each pixel” and “for each triangle” loops have the opposite nesting. This is a trivial change and the body of the inner loop is largely similar in each case. But the trivial change has profound implications for memory access patterns and how we can algorithmically optimize each.
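
The difference in nesting can be sketched in a few lines of C++. The Triangle placeholder, the Visit callback, and the function names below are assumptions made for illustration; the callback stands in for the shared inner-loop body (the intersection or coverage test plus shading).

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical placeholder; a real triangle would carry vertex data.
struct Triangle {
    int id;
};

// Stand-in for the body of the inner loop.
using Visit = std::function<void(int pixel, const Triangle& tri)>;

// Ray-caster nesting: pixels in the outer loop, triangles in the inner
// loop. Each pixel is finished before the next begins, but the whole
// triangle list is traversed once per pixel.
void rayCastOrder(int numPixels, const std::vector<Triangle>& tris,
                  const Visit& visit) {
    assert(numPixels >= 0);
    for (int p = 0; p < numPixels; ++p) {
        for (const Triangle& t : tris) {
            visit(p, t);
        }
    }
}

// Rasterizer nesting: triangles in the outer loop, pixels in the inner
// loop. Each triangle is read once, but a pixel may be revisited by
// many triangles.
void rasterizeOrder(int numPixels, const std::vector<Triangle>& tris,
                    const Visit& visit) {
    assert(numPixels >= 0);
    for (const Triangle& t : tris) {
        for (int p = 0; p < numPixels; ++p) {
            visit(p, t);
        }
    }
}
```

Both functions make the same set of (pixel, triangle) visits; only the order changes, and that order determines which data stays resident in cache between consecutive iterations.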

Scene triangles are typically stored in the heap. They may be in a flat 1D array, or arranged in a more sophisticated data structure. If they are in a simple data structure such as an array, then we can ensure reasonable memory coherence by iterating through them in the same order that they appear in memory. That produces efficient cache behavior. However, that iteration also requires substantial
