CHAPTER 4
INDEXING TIME-SERIES UNDER CONDITIONS OF NOISE
Michail Vlachos and Dimitrios Gunopulos
Department of Computer Science and Engineering
Bourns College of Engineering
University of California, Riverside, Riverside, CA 92521, USA
E-mail: {mvlachos, dg}@cs.ucr.edu
Gautam Das
Microsoft Research, One Microsoft Way, Redmond, WA 98052, USA
E-mail: gautamd@microsoft.com
We present techniques for the analysis and retrieval of time-series under
conditions of noise. This is an important topic because the data obtained
using various sensors (examples include GPS data or video tracking data)
are typically noisy. The performance of previously used measures generally
degrades under noisy conditions. Here we formalize non-metric similarity
functions based on the Longest Common Subsequence that are very robust
to noise. Furthermore, they provide an intuitive notion of similarity
between time-series by giving more weight to the similar portions of the
sequences. Stretching of sequences in time is allowed, as well as global
translation of the sequences in space. Efficient approximate algorithms
that compute these similarity measures are also provided. We compare
these new methods to the widely used Euclidean and Time Warping distance
functions (for real and synthetic data) and show the superiority of our
approach, especially under the strong presence of noise. We prove a
weaker version of the triangle inequality and employ it in an indexing
structure to answer nearest neighbor queries. Finally, we present
experimental results that validate the accuracy and efficiency of our
approach.
Keywords: Longest Common Subsequence; time-series; spatio-temporal;
outliers; time warping.
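
As a rough illustration of the kind of similarity function the abstract
describes, the following is a minimal sketch of a Longest Common
Subsequence measure for one-dimensional real-valued sequences. The names
lcss_length and lcss_similarity, the matching threshold eps, the time
window delta, and the normalization by the shorter sequence length are
all assumptions made for this example; they are not taken from the text
above, and the sketch omits the spatial translation and the approximate
algorithms mentioned in the abstract.

```python
def lcss_length(a, b, eps, delta):
    """Length of the longest common subsequence of two numeric sequences.

    Two points a[i] and b[j] are treated as matching when their values
    differ by less than eps (assumed spatial threshold) and their
    positions differ by at most delta (assumed time window).
    """
    n, m = len(a), len(b)
    # dp[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) < eps and abs(i - j) <= delta:
                # Matching points extend the common subsequence.
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                # Otherwise skip one point from either sequence.
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]


def lcss_similarity(a, b, eps, delta):
    """LCSS length normalized by the shorter sequence (assumed choice)."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, eps, delta) / min(len(a), len(b))


# A clean sequence and a copy with small jitter plus one large outlier.
clean = [1.0, 2.0, 3.0, 4.0]
noisy = [1.1, 2.1, 50.0, 3.1, 4.1]
score = lcss_similarity(clean, noisy, eps=0.5, delta=1)
```

In this sketch the outlier value 50.0 is simply left unmatched, so it
does not reduce the score, whereas a point-by-point distance such as the
Euclidean one would be dominated by it. Allowing a time window of one
step lets the points after the outlier still be matched.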