Database Reference
In-Depth Information
the triangle inequality. Note that this method does not achieve results com-
parable to those of Keogh (2002).
4. Other Approaches
Not all recent methods rely on spatial access methods. This section contains
a sampling of other approaches.
4.1. Using Sux Trees to Avoid Redundant Computation
Baeza-Yates and Gonnet (1999) and Park et al . (2000) independently intro-
duce the idea of using sux trees [Gusfield (1997)] to avoid duplicate cal-
culations when using dynamic programming to compare a query sequence
with other sequences in a database. Baeza-Yates and Gonnet use edit dis-
tance (see Appendix for details), while Park et al . use time warping.
The basic idea of the approach is as follows: When comparing two
sequences
x
and
y
with dynamic programming, a subtask will be to compare
y 1: j . If two other sequences are compared that have
identical prefixes to these (for instance, the query and another sequence
from the database), the same calculations will have to be performed again.
If a sequential search for subsequence matches is performed, the cost may
easily become prohibitive.
To avoid this, all the sequences in the database are indexed with a sux
tree. A sux tree stores all the suxes of a sequence, with identical pre-
fixes stored only once. By performing a depth-first traversal of the sux
tree one can access every sux (which is equivalent to each possible subse-
quence position) and backtrack to reuse the calculations that have already
been performed for the prefix that the current and the next candidate sub-
sequence share.
Baeza-Yates and Gonnet assume that the sequences are strings over
a finite alphabet; Park et al. avoid this assumption by classifying each
sequence element into one of a finite set of categories. Both methods achieve
subquadratic running times.
x 1: i and
their prefixes
4.2. Data Reduction through Piecewise Linear
Approximation
Keogh et al. have introduced a dimensionality reduction technique using
piecewise linear approximation of the original sequence data [Keogh (1997),
Keogh and Pazzani (1998), Keogh and Pazzani (1999a), Keogh and Pazzani
(1999b), Keogh and Smyth (1997)]. This reduces the number of data
Search WWH ::




Custom Search