Database Reference
In-Depth Information
1. The regions that are stored in the index are not fixed, but generated
by the Add and Grow operations. Due to the streaming nature of the
application, past decisions about growing and adding affect the set of
regions stored in the index and hence future performance.
2. Traditionally, the performance of index structures has been measured
in terms of the cost of search and in some cases update. Since the goal
of function approximation is to minimize total simulation cost, func-
tion evaluations and region operations must also be accounted for when
evaluating the performance of an index.
In the light of these observations a principled analysis of the various costs
in the function approximation algorithm leads to the discovery of novel trade-
offs. These tradeoffs produce significant and varying effects on different in-
dex structures. The remainder of this section briefly introduces the tradeoffs
and the tuning parameters that have been proposed to exploit them. The in-
dexing problem is studied here using the concrete instantiation of the ISAT
Algorithm using ellipsoidal regions with linear models. Therefore, regions are
often referred to as ellipsoids in the rest of the section. It is important to note,
however, that the ideas are applicable to other types of regions and associated
functions.
Tuning Retrieves. In most high-dimensional index structures the ellipsoid
containing a query point is usually not the first ellipsoid examined since an
index structure often performs fast approximations during the search before
performing an expensive check at each leaf level. The index ends up looking
at a number of ellipsoids before finding “the right one.” The additional ellip-
soids that are examined by the index are called false positives . For each false
positive, the algorithm pays to search and retrieve the ellipsoid from the index
and to check if the ellipsoid contains the query point. In traditional indexing
problems, if an object that satisfies the query condition exists in the index,
then finding this object during search is mandatory. Therefore, the number of
false positives is a fixed property of the index.
However, the function approximation problem provides the flexibility to
tune the number of false positives, because we can fall back to evaluating
the expensive function if the index search was not successful. The number of
false positives can be tuned by limiting the number of ellipsoids examined
during the retrieve step. This parameter is denoted by Ellr. Ellr places an
upper bound on the number of false positives for a query. Tuning Ellr controls
several effects.
Effect 1: Decreasing Ellr reduces the cost of the retrieve operation as
fewer ellipsoids are retrieved and examined.
Effect 2: Decreasing Ellr decreases the chance of finding an ellip-
soid containing the query point, thereby resulting in more function
evaluations.
Search WWH ::




Custom Search