C = (30192, 15771, 9030), D = (29971, 15764, 9498), and E = (30304, 15888, 9133).
These could be represented as (-52, -9, 121), (51, 103, 21), (54, -95, -234), (-167, -102, 234), and (166, 22, -131) with a common 16-bit offset of (30138, 15866, 9264). Here, the quantized vertices could be stored using (signed) 9-bit values, a considerable saving over the original format.
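To make the idea concrete, here is a minimal sketch (not from the original text; all names are illustrative) of packing each quantized vertex as three signed 9-bit displacements in a single 32-bit word, shared against a common 16-bit origin:

#include <cstdint>

// Common 16-bit origin shared by a group of quantized vertices,
// e.g. (30138, 15866, 9264) in the example above
struct VertexGroup {
    int16_t originX, originY, originZ;
};

// Pack three signed 9-bit displacements, each in [-256, 255], into 27 bits
inline uint32_t PackDisplacement(int dx, int dy, int dz)
{
    return (((uint32_t)dx & 0x1FF)) |
           (((uint32_t)dy & 0x1FF) << 9) |
           (((uint32_t)dz & 0x1FF) << 18);
}

// Recover a signed value from a 9-bit field
inline int SignExtend9(uint32_t bits)
{
    int v = (int)(bits & 0x1FF);
    return (v >= 256) ? v - 512 : v;
}

// Reconstruct the full 16-bit vertex from the group origin and the packed word
inline void UnpackVertex(const VertexGroup &g, uint32_t packed,
                         int16_t &x, int16_t &y, int16_t &z)
{
    x = (int16_t)(g.originX + SignExtend9(packed));
    y = (int16_t)(g.originY + SignExtend9(packed >> 9));
    z = (int16_t)(g.originZ + SignExtend9(packed >> 18));
}

Three packed components occupy 27 bits, compared to 48 bits for the raw 16-bit triple; the five spare bits could hold per-vertex flags.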
Even more space can be saved by allowing the origin to float from one vertex
to the next. That is, the displacement is computed from the previous vertex, thus
effectively implementing a delta compression scheme. Such a scheme works best
when the vertices represent an indexed mesh so that their order can be shuffled to
facilitate a better compression ratio. Rearranging the vertices A through E in the order
D, A, B, E, C gives the first vertex as (29971, 15764, 9498) and the following vertices as (115, 93, -113), (103, 112, -100), (115, -81, -122), and (-112, -117, -73). The vertex components can now be represented using (signed) 8-bit values.
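A possible encoder/decoder pair for this delta scheme (an illustrative sketch, not code from the original) keeps the first vertex at full 16-bit precision and stores each subsequent vertex as a signed 8-bit displacement from its predecessor, assuming the vertices have been reordered, as above, so that all displacements fit in a byte:

#include <cstdint>
#include <vector>

struct Vertex16 { int16_t x, y, z; };
struct Delta8   { int8_t dx, dy, dz; };

// Encode: first vertex stored in full, the rest as 8-bit deltas from the
// previous vertex. Assumes verts is nonempty and consecutive differences
// fit in [-128, 127].
void DeltaEncode(const std::vector<Vertex16> &verts,
                 Vertex16 &first, std::vector<Delta8> &deltas)
{
    first = verts[0];
    deltas.clear();
    for (size_t i = 1; i < verts.size(); i++) {
        Delta8 d;
        d.dx = (int8_t)(verts[i].x - verts[i - 1].x);
        d.dy = (int8_t)(verts[i].y - verts[i - 1].y);
        d.dz = (int8_t)(verts[i].z - verts[i - 1].z);
        deltas.push_back(d);
    }
}

// Decode by accumulating the deltas back onto a running vertex
std::vector<Vertex16> DeltaDecode(Vertex16 first,
                                  const std::vector<Delta8> &deltas)
{
    std::vector<Vertex16> out;
    out.push_back(first);
    Vertex16 v = first;
    for (size_t i = 0; i < deltas.size(); i++) {
        v.x = (int16_t)(v.x + deltas[i].dx);
        v.y = (int16_t)(v.y + deltas[i].dy);
        v.z = (int16_t)(v.z + deltas[i].dz);
        out.push_back(v);
    }
    return out;
}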
Spatial partitioning methods can be combined with storing the data in the leaves
in a quantized format because the range of the data in a leaf will generally only span
a small part of the entire world. Quantization of localized data saves memory yet
allows a full range of coordinate values. If the leaves are cached after decompression,
the extra cost associated with decompression is generally counteracted by amortization. Having leaf contents stored relative to an origin corresponding to the spatial
position of the parent node can help increase robustness because computations are
now performed near that origin, maintaining more precision in the calculations.
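As an illustration of this combination (the layout and field names are assumptions, not from the original), a leaf might store its vertices as 16-bit values relative to the parent node's position and decompress them on access:

#include <cstdint>

// Leaf data quantized relative to the parent node's world-space position
struct LeafNode {
    float originX, originY, originZ;  // full-precision node origin
    float scale;                      // quantization step (world units per unit)
    uint16_t numVerts;
    // Vertex positions relative to the origin; because a leaf spans only a
    // small part of the world, 16 bits per component suffice
    const int16_t *quantizedVerts;    // 3 * numVerts components
};

// Decompress vertex i into floating-point world coordinates
inline void DecompressVertex(const LeafNode &leaf, int i,
                             float &x, float &y, float &z)
{
    x = leaf.originX + leaf.scale * leaf.quantizedVerts[3 * i + 0];
    y = leaf.originY + leaf.scale * leaf.quantizedVerts[3 * i + 1];
    z = leaf.originZ + leaf.scale * leaf.quantizedVerts[3 * i + 2];
}

Keeping intermediate computations in the small origin-relative coordinates, rather than in full world coordinates, is what preserves the extra precision mentioned above.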
13.3.3 Prefetching and Preloading
Because stalls caused by load accesses to uncached memory cause severe slowdowns,
most modern CPUs provide mechanisms for helping avoid these stalls. One such
mechanism is the prefetch instruction, which allows the programmer to direct the
CPU to load a given cache line into cache while the CPU continues to execute as
normal. By issuing a prefetch instruction in advance, the data so loaded will be in
cache by the time the program needs it, allowing the CPU to execute at full speed
without stalling. The risk is that if the prefetch instruction is issued too early then
by the time the CPU is ready for the data it might have been flushed from cache.
Conversely, if it is issued too late then there is little benefit to the prefetch. Ideally,
the prefetch instruction should be issued early enough to account for the memory
latency of a load, but not more.
For linear structures, prefetch instructions are both effective and easy to use. For
example, a simple processing loop such as
// Loop through and process all 4n elements
for (int i = 0; i < 4 * n; i++)
    Process(elem[i]);
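One way such a loop might be rewritten with explicit prefetching is to request the element a fixed number of iterations ahead of the one currently being processed; the sketch below assumes the GCC/Clang __builtin_prefetch intrinsic and an illustrative lookahead distance, neither of which comes from the original text.

// Same loop with software prefetching: while element i is being processed,
// ask the CPU to start loading the element LOOKAHEAD iterations ahead so it
// is (ideally) already in cache when the loop reaches it
const int LOOKAHEAD = 8;  // tune to cover load latency vs. work per element
for (int i = 0; i < 4 * n; i++) {
    if (i + LOOKAHEAD < 4 * n)
        __builtin_prefetch(&elem[i + LOOKAHEAD]);
    Process(elem[i]);
}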
 