Database Reference
In-Depth Information
2.1. Robust Distance Measures
The choice of distance measure is highly domain dependent, and in some
cases a simple
L p norm such as Euclidean distance may be sucient. How-
ever, in many cases, this may be too brittle [Keogh and Pazzani (1999b)]
since it does not tolerate such transformations as scaling, warping, or trans-
lation along either axis. Many of the newer retrieval methods focus on using
more robust distance measures, which are invariant under such transforma-
tions as time warping (see Appendix for details) without loss of perfor-
mance.
2.2. Good Indexing Methods
Faloutsos et al. (1994) list the following desirable properties for an indexing
method:
(i) It should be faster than a sequential scan.
(ii) It should incur little space overhead.
(iii) It should allow queries of various length.
(iv) It should allow insertions and deletions without rebuilding the index.
(v) It should be correct: No false dismissals must occur.
To achieve high performance, the number of false alarms should also be
low. Keogh et al. (2001b) add the following criteria to the list above:
(vi) It should be possible to build the index in reasonable time.
(vii) The index should preferably be able to handle more than one distance
measure.
2.3. Spatial Indices and the Dimensionality Curse
The general problem of similarity based retrieval is well known in the field of
information retrieval, and many indexing methods exist to process queries
eciently [Baeza-Yates and Ribeiro-Neto (1999)]. However, certain prop-
erties of time sequences make the standard methods unsuitable. The fact
that the value ranges of the sequences usually are continuous, and that
the elements may not be equi-spaced, makes it dicult to use standard
text-indexing techniques such as sux-trees. One of the most promising
techniques is multidimensional indexing (
-trees [Guttman (1984)], for
instance), in which the objects in question are multidimensional vectors,
and similar objects can be retrieved in sublinear time. One requirement of
such spatial access methods is that the distance measure must be monotonic
R
Search WWH ::




Custom Search