Database Reference
In-Depth Information
similar to the PCA, except that the segments need not be of equal length.
Thus regions with great fluctuations may be represented with several short
segments, while reasonably featureless regions may be represented with
fewer, long segments. The main contribution of this representation is that
it is a more effective compression than the PCA, while still representing the
original faithfully.
Two distance measures are developed for the APCA, one which is guar-
anteed to underestimate Euclidean distance, and one which can be cal-
culated more eciently, but which may generate some false dismissals. It
is also shown that this technique, like the PCA, can handle arbitrary
L p
norms. The empirical data suggest that the APCA outperforms both meth-
ods based on the discrete Fourier transform, and methods based on the dis-
crete wavelet transform with a speedup of one to two orders of magnitude.
In a recent paper, Keogh (2002) develops a distance measure that is
a lower bound for dynamic time warping, and uses the PCA approach to
index it. The distance measure is based on the assumption that the allowed
warping is restricted, which is often the case in real applications. Under this
assumption, Keogh constructs two warped versions of the sequence to be
indexed: An upper and a lower limit. The PCA signatures of these limits are
then extracted, and together with Keogh's distance measure form an exact
index (one with no false dismissals) with high precision. Keogh performs
extensive empirical experiments, and his method clearly outperforms any
other existing method for indexing time warping.
3.4. Landmark Methods
In 1997, Keogh and Smyth introduce a probabilistic method for sequence
retrieval, where the features extracted are characteristic parts of the
sequence, so-called feature shapes . Keogh (1997) uses a similar landmark
based technique. Both these methods also use the dimensionality reduction
technique of piecewise linear approximation (see Section 4.2) as a prepro-
cessing step. The methods are based on finding similar landmark features
(or shapes) in the target sequences, ignoring shifting and scaling within
given limits. The technique is shown to be significantly faster than sequen-
tial scanning (about an order of magnitude), which may be accounted for by
the compression of the piecewise linear approximation. One of the contribu-
tions of the method is that it is one of the first that allows some longitudinal
scaling.
A more recent paper by Perng et al . (2000) introduces a more general
landmark model. In its most general form, the model allows any point of
Search WWH ::




Custom Search