Graphics Reference
In-Depth Information
Even though a full optimization step is infeasible, once a partitioning axis has
been selected a small number of hill-climbing steps can be performed to improve on
the axis. One approach involves perturbing the direction of the axis, replacing the
selected axis if the perturbed axis performs better, and repeating this for as many
steps as can be afforded.
An interesting, but largely unexplored, option is to apply other statistical methods
than principal component analysis to the problem of finding partitioning axes. These
include the related methods of projection pursuit , independent component analysis , and
blind signal separation , which are all techniques for recovering unobserved individual
components from an observed mixed source. For instance, the method of projection
pursuit as a simplification can be described as a way of obtaining a direction for which
the entropy (rather than the variance) of the projected data is maximized. As entropy
is a measure of information or “interestingness,” and data clusters will have high
entropy, this direction forms a good candidate axis for partitioning the data in two or
more clusters. For an introduction to projection pursuit and independent component
analysis see [Stone98]. For blind signal separation see [Cardoso98].
6.2.1.3 Choice of Split Point
The infeasibility of optimizing over all possible axes applies to the choice of split point
as well. As there are infinitely many splitting points along the axis, again the selection
must be restricted to a small set of choices, such as:
Median of the centroid coordinates (object median). Splitting at the object median
evenly distributes the primitives between the subsets, resulting in a balanced
tree. The median is trivially found in O ( n log n ) time by sorting the points, or in
O ( n ) time using a more sophisticated method (see [Cormen90] for details).
Mean of the centroid coordinates (object mean). Well-balanced trees do not nec-
essarily give the best query times. Splitting along the local axis with largest
variance, [Klosowski98] report that using the object mean outperforms using
the object median. They report splitting at the mean consistently gives smaller
volume trees, with a lower number of operations performed and better query
times as a result. The object mean is found in O ( n ) time.
Median of the bounding volume projection extents (spatial median). Splitting at the
spatial median (thus splitting the volume in two equal parts) is an attractive
option, as the split point is found in constant time from examining just the
bounding volume and not its contained data. This alternative is often used
when the axis is selected from the parent volume (for example, using the
longest-side rule).
Splitting at k evenly spaced points along the bounding volume projection extents.
Instead of spending time on what amounts to “intelligently guessing” a good
Search WWH ::




Custom Search