consider the “billboard” technique described in Section 14.6.2, in which
far-away complexity is simulated via a texture-mapped planar polygon.
Subdivision Surfaces / GPU Tessellation
In contrast to simplification techniques, this technique has the opposite
goal—it uses an iterative subdivision algorithm (see Section 14.5.3 and
Chapters 22 and 23) to add complexity to a coarse “base mesh” to provide
a smoother appearance. It qualifies as an optimization because the GPU
performs the tessellation; the CPU side deals only with the coarse base
mesh, which reduces the CPU-to-GPU bandwidth consumed. Since
the process is iterative, the GPU-based tessellation can adjust the amount
of smoothing work based on the primitive's distance from the viewer, thus
providing a kind of variable level-of-detail control.
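The distance-based adjustment can be sketched as follows. This is only an
illustration in Python of logic that would run on the GPU in practice; the
function names, the `near` and `max_level` parameters, the rule of dropping
one subdivision pass per doubling of distance, and the one-to-four triangle
split are all assumptions, not taken from the text.

```python
import math

def subdivision_level(distance, near=1.0, max_level=5):
    """Choose how many subdivision passes to run for a primitive.

    Assumed rule: full detail at or inside `near`, then one pass fewer
    each time the distance from the viewer doubles.
    """
    if distance <= near:
        return max_level
    drop = int(math.log2(distance / near))
    return max(0, max_level - drop)

def triangle_count(base_triangles, level):
    # Assumes each pass splits every triangle into four.
    return base_triangles * 4 ** level
```

Under these assumptions a 100-triangle base mesh refined twice yields 1,600
triangles, while the CPU side still handles only the original 100.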
16.4.2.2 Generating an Efficient Sequence of IM-Layer Instructions to Render the Simplified Scene
The graphics hardware pipeline is a complex combination of functional units,
and achieving maximum throughput requires expert knowledge of the pipeline's
idiosyncrasies and potential bottlenecks. Certain types of operation sequences can
cause “pipeline stalls” that radically undermine performance. As the pipeline has
adapted to relieve these bottlenecks over the years, new bottlenecks have arisen,
presenting opportunities for further adaptation.
In particular, state changes (e.g., a change in the current-material state
variable, or a switch to a different vertex shader) disrupt pipeline throughput
and should be minimized by careful ordering of the primitive-drawing sequence.
The actual cost varies by API, hardware platform, driver software, and type of
state variable being modified. Nevertheless, as a rule, each state change should
be followed by the generation of as many primitives as possible; thus, as part
of the AMIP, logic should analyze the potentially visible set for the purpose of
reordering the primitives so as to draw in a batch all primitives that require
the same state configuration. In a chess application, one might model all the
black pieces as obsidian and the white pieces as onyx. It then makes sense to
render all the black pieces before all the white ones, or vice versa.
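The reordering can be sketched as a sort on the state configuration followed
by grouping. This is a minimal sketch, not the AMIP's actual logic: the
`Primitive` record, its `shader` and `material` fields, and the helper names
are hypothetical, and a real renderer would weight the sort key by the
relative cost of each kind of state change.

```python
from dataclasses import dataclass
from itertools import groupby

@dataclass(frozen=True)
class Primitive:
    name: str
    shader: str     # hypothetical state variable
    material: str   # hypothetical state variable

def state_key(p):
    # The state configuration a primitive needs before it can be drawn.
    return (p.shader, p.material)

def batch_by_state(primitives):
    """Group primitives so that each state configuration is set once."""
    ordered = sorted(primitives, key=state_key)  # stable sort
    return [(key, list(group)) for key, group in groupby(ordered, key=state_key)]

def count_state_changes(primitives):
    """Count how often the state must be switched when drawing in order."""
    changes, current = 0, None
    for p in primitives:
        if state_key(p) != current:
            changes += 1
            current = state_key(p)
    return changes
```

Drawing black and white pieces in interleaved order costs one state change
per piece; after batching, the count falls to one per distinct material.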
The modification of specification order generally does not have an impact on
the final rendered image, but it should be noted that the use of translucent materials
presents an exception and does complicate this optimization task (and others, e.g.,
occlusion culling).
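One common way to accommodate this exception, which the text itself does not
prescribe, is to batch only the opaque primitives by state and to draw the
translucent ones afterward, sorted from farthest to nearest. The sketch below
assumes each primitive is a dictionary with hypothetical `state`, `depth`
(distance from the viewer), and `translucent` entries.

```python
def draw_order(primitives):
    """Return a draw order: opaque by state, then translucent back to front."""
    opaque = [p for p in primitives if not p["translucent"]]
    translucent = [p for p in primitives if p["translucent"]]
    # Opaque primitives may be reordered freely, so group them by state.
    opaque.sort(key=lambda p: p["state"])
    # Translucent primitives blend with what is behind them, so their
    # order matters: draw the farthest first (assumed convention).
    translucent.sort(key=lambda p: p["depth"], reverse=True)
    return opaque + translucent
```

The cost of this choice is that translucent primitives can no longer be
grouped by state, which is the complication the paragraph above points out.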
16.4.2.3 Using Caching to Avoid Redundant Computations in Performance of Tasks in Categories 1 and 2
The CPU and memory resources necessary to perform the activities described
above for task categories 16.4.2.1(a-c) and 16.4.2.2 can be substantial. But many
of these activities, when executed to produce frame i, produce results that remain
useful for frame i+1, if the difference between the two frames meets certain
requirements. Thus, caching is of value in reducing the CPU cost of such activities.
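A minimal sketch of such frame-to-frame reuse follows. The `FrameCache`
class and its `key` argument are hypothetical; the key stands in for
whatever condition must hold between two frames for the old result to remain
valid (for example, a version counter that changes only when the static
part of the scene is edited).

```python
class FrameCache:
    """Keep the result computed for one frame and reuse it while valid."""

    def __init__(self, compute):
        self._compute = compute   # expensive function of the key
        self._has_value = False
        self._key = None
        self._value = None

    def get(self, key):
        # Recompute only when nothing is cached yet or the key has changed,
        # i.e., when the previous frame's result is no longer usable.
        if not self._has_value or key != self._key:
            self._value = self._compute(key)
            self._key = key
            self._has_value = True
        return self._value
```

Calling `get` with the same key on consecutive frames runs the expensive
computation once; a changed key triggers exactly one recomputation.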
Caches used for this purpose are called acceleration data structures and are
used in two distinct ways.
Cache the result of computations.
- For example, consider 16.4.2.2. The generated IM-layer instruction
sequence associated with static portions of the scene can be cached and
 