A SURVEY OF RECENT METHODS FOR EFFICIENT RETRIEVAL OF SIMILAR TIME SEQUENCES - Data Mining in Time Series Databases

Database Reference

In-Depth Information

the triangle inequality. Note that this method does not achieve results com-

parable to those of Keogh (2002).

4. Other Approaches

Not all recent methods rely on spatial access methods. This section contains

a sampling of other approaches.

4.1. Using Sux Trees to Avoid Redundant Computation

Baeza-Yates and Gonnet (1999) and Park et al . (2000) independently intro-

duce the idea of using sux trees [Gusfield (1997)] to avoid duplicate cal-

culations when using dynamic programming to compare a query sequence

with other sequences in a database. Baeza-Yates and Gonnet use edit dis-

tance (see Appendix for details), while Park et al . use time warping.

The basic idea of the approach is as follows: When comparing two

sequences

x

and

y

with dynamic programming, a subtask will be to compare

y 1: j . If two other sequences are compared that have

identical prefixes to these (for instance, the query and another sequence

from the database), the same calculations will have to be performed again.

If a sequential search for subsequence matches is performed, the cost may

easily become prohibitive.

To avoid this, all the sequences in the database are indexed with a sux

tree. A sux tree stores all the suxes of a sequence, with identical pre-

fixes stored only once. By performing a depth-first traversal of the sux

tree one can access every sux (which is equivalent to each possible subse-

quence position) and backtrack to reuse the calculations that have already

been performed for the prefix that the current and the next candidate sub-

sequence share.

Baeza-Yates and Gonnet assume that the sequences are strings over

a finite alphabet; Park et al. avoid this assumption by classifying each

sequence element into one of a finite set of categories. Both methods achieve

subquadratic running times.

x 1: i and

their prefixes

4.2. Data Reduction through Piecewise Linear

Approximation

Keogh et al. have introduced a dimensionality reduction technique using

piecewise linear approximation of the original sequence data [Keogh (1997),

Keogh and Pazzani (1998), Keogh and Pazzani (1999a), Keogh and Pazzani

(1999b), Keogh and Smyth (1997)]. This reduces the number of data

Search WWH ::

Custom Search

Home