1.7.1 3D Scene Rendering
We developed a first application for rendering transparent, textured scenes. It is included in the companion source code (bin/seethrough.exe). Figure 1.3 shows a 3D rendering of a large scene with textures and transparency, and gives the timing breakdown for each pass and each technique, as well as their memory cost.
1.7.2 Benchmarking
For benchmarking we developed an application rendering transparent, front-facing quads in orthographic projection. The position and depth of the quads are randomized and change every frame. All measurements are averaged over six seconds of running time. We control the size and number of quads, as well as their opacity. We use the ARB_timer_query extension to measure the time to render a frame. This includes the Clear, Build, and Render passes, as well as checking for main buffer overflow. All tests are performed on a GeForce GTX 480 and a GeForce Titan using drivers 320.49. We expect these performance numbers to change with future driver revisions due to the issues mentioned in Section 1.6. Nevertheless, our current implementation exhibits consistent performance levels across all techniques, as well as between the Fermi and Kepler architectures.
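The sketch below shows one way such per-frame GPU timing can be done with ARB_timer_query. It is a minimal illustration, not code from the companion framework: it assumes a current OpenGL context with GLEW for function loading, and the clearPass, buildPass, and renderPass callbacks are placeholders for the actual passes.

// Minimal per-frame GPU timing with ARB_timer_query (illustrative sketch).
// Assumes a current OpenGL context; GLEW is used here for function loading.
#include <GL/glew.h>

static GLuint g_timeQuery = 0;

void initTimer()
{
    glGenQueries(1, &g_timeQuery);
}

// Measures one frame: the Clear, Build, and Render passes are passed in as
// placeholder callbacks. Returns the GPU time in milliseconds.
double timeFrameMs(void (*clearPass)(), void (*buildPass)(), void (*renderPass)())
{
    glBeginQuery(GL_TIME_ELAPSED, g_timeQuery);
    clearPass();   // reset per-pixel data
    buildPass();   // insert fragments
    renderPass();  // sort/blend and output the final color
    glEndQuery(GL_TIME_ELAPSED);

    // Blocks until the result is available; acceptable for benchmarking.
    GLuint64 elapsedNs = 0;
    glGetQueryObjectui64v(g_timeQuery, GL_QUERY_RESULT, &elapsedNs);
    return elapsedNs / 1.0e6;
}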
The benchmarking framework is included in the companion source code (bin/benchmark.exe). The Python script runall.py launches all benchmarks.
Number of fragments. For a fixed depth complexity, the per-frame time is expected to be linear in the number of fragments. This is verified by all implementations, as illustrated in Figure 1.5. We measure this by rendering a number of quads perfectly aligned on top of each other, in randomized depth order. The number of quads controls the depth complexity. We adjust the size of the quads so that only the number of fragments varies.
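As an illustration, a hypothetical sketch of this test setup follows (the names and structure are ours, not the companion framework's): numQuads quads are stacked at the same screen position with randomized depths, so numQuads fixes the per-pixel depth complexity while quadSize controls the total fragment count.

#include <random>
#include <vector>

struct Quad { float x, y, size, depth; };

// Builds 'numQuads' screen-aligned quads stacked at the same position, each
// with a randomized depth in [0,1); regenerated every frame in the benchmark.
std::vector<Quad> makeAlignedQuads(int numQuads, float quadSize, std::mt19937& rng)
{
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    std::vector<Quad> quads;
    quads.reserve(numQuads);
    for (int i = 0; i < numQuads; ++i)
        quads.push_back({0.0f, 0.0f, quadSize, dist(rng)});
    return quads;
}
// The fragment count is roughly numQuads * quadSize * quadSize, so varying
// quadSize at a fixed numQuads changes only the number of fragments.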
Depth complexity. In this experiment we compare the overall performance for a fixed number of fragments but a varying depth complexity. As the size of the per-pixel lists increases, we expect a quadratic increase in frame rendering time. This is verified in Figure 1.6. The Pre-Open technique is the most severely impacted by the increase in depth complexity. The main reason is that the sort occurs in global memory, and each added fragment leads to a full traversal of the list via the eviction mechanism.
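This quadratic behavior can be read off a simple cost model, sketched below under the assumption stated above that each inserted fragment traverses the whole sorted per-pixel list: the i-th fragment visits on the order of i cells, so a depth complexity of d costs roughly d(d+1)/2 cell visits per pixel.

#include <cstdio>

// Toy cost model: inserting fragment i into an already-sorted list of i-1
// cells traverses on the order of i cells, giving d*(d+1)/2 visits in total.
long long perPixelCellVisits(int depthComplexity)
{
    long long visits = 0;
    for (int i = 1; i <= depthComplexity; ++i)
        visits += i;
    return visits;
}

int main()
{
    const int depths[] = {4, 8, 16, 32, 64};
    for (int d : depths)
        std::printf("depth complexity %2d -> ~%lld cell visits per pixel\n",
                    d, perPixelCellVisits(d));
    return 0;
}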
Early culling. In scenes with a mix of transparent and opaque objects, early culling limits the depth complexity per pixel. The Pre-Open and Pre-Lin techniques both allow for early culling (see Section 1.4.2). Figure 1.7 demonstrates the benefit of early culling. The threshold is set to ignore all fragments once an accumulated opacity of 0.95 is reached (1 being fully opaque).
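The sketch below is a simplified CPU-side model of such an early-culling test, not the actual shader code: a new fragment is discarded when the fragments already stored in front of it accumulate at least the 0.95 opacity threshold.

#include <vector>

struct Fragment { float depth; float opacity; };

// Returns true if a fragment at 'newDepth' can be culled: the fragments
// already in front of it (sorted front to back) accumulate enough opacity.
bool canCull(const std::vector<Fragment>& listFrontToBack,
             float newDepth, float threshold = 0.95f)
{
    float accumulated = 0.0f;
    for (const Fragment& f : listFrontToBack) {
        if (f.depth >= newDepth) break;             // only closer fragments occlude
        accumulated += (1.0f - accumulated) * f.opacity;
        if (accumulated >= threshold) return true;  // effectively opaque already
    }
    return false;
}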