CHAPTER 1
SEGMENTING TIME SERIES: A SURVEY AND
NOVEL APPROACH
Eamonn Keogh
Computer Science & Engineering Department, University of California —
Riverside, Riverside, California 92521, USA
E-mail: eamonn@cs.ucr.edu
Selina Chu, David Hart, and Michael Pazzani
Department of Information and Computer Science, University of California,
Irvine, California 92697, USA
E-mail: {selina, dhart, pazzani}@ics.uci.edu
In recent years, there has been an explosion of interest in mining time
series databases. As with most computer science problems, representation
of the data is the key to efficient and effective solutions. One of the
most commonly used representations is piecewise linear approximation.
This representation has been used by various researchers to support
clustering, classification, indexing and association rule mining of time
series data. A variety of algorithms have been proposed to obtain this
representation, with several algorithms having been independently
rediscovered several times. In this chapter, we undertake the first
extensive review and empirical comparison of all proposed techniques. We
show that all these algorithms have fatal flaws from a data mining
perspective. We introduce a novel algorithm that we empirically show to
be superior to all others in the literature.
Keywords: Time series; data mining; piecewise linear approximation;
segmentation; regression.
1. Introduction
In recent years, there has been an explosion of interest in mining time
series databases. As with most computer science problems, representation
of the data is the key to efficient and effective solutions. Several high level