forecasting, and anomaly detection. We aim to provide a unified view
of time series stream mining techniques for dimensionality reduction
(analysis and data reduction across streams) and filtering (analysis and
data reduction across time).
We describe methods that capture correlations and find hidden
variables that describe trends in collections of streams. Discovered
trends can then be used to quickly spot potential anomalies and to
forecast efficiently. We describe a method that incrementally finds
these correlation patterns and hidden variables, which summarize the
key trends in the entire stream collection, without buffering stream
values and without directly comparing pairs of streams. Moreover, it is
any-time and dynamically detects changes. We also describe efficient
online methods for quick forecasting (estimation of future values) and
imputation (estimation of past, missing values) on multiple time series
streams. Finally, we describe methods that capture and summarize
auto-correlations (correlations within a single series, across time),
which also describe key trends. We also briefly explain how these techniques
relate to others, and illustrate various trade-offs that are available to
practitioners.
Keywords: streams, time series, filtering, dimensionality reduction, forecasting
1. Introduction
In this chapter, we consider the problem of capturing correlations
both across multiple streams and across time (auto-correlations).
As we shall see, these two problems are inherently related, and similar
techniques are applicable to both, even though the interpretation of the
results may be different. In the first case, correlations across different
streams allow us to find hidden variables that can summarize collections
of time series data streams. In the second case, auto-correlations
summarize patterns across time and can capture regular or periodic
trends in time series streams.
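
To make the notion of auto-correlation concrete, the following is a
minimal sketch of the standard sample auto-correlation of a single
series at a given lag. The function name, the synthetic sine-wave
series, and its period of 24 samples are assumptions chosen purely for
illustration; this is a batch computation over a stored series, not one
of the streaming methods discussed in this chapter.

```python
import math


def autocorrelation(series, lag):
    """Sample auto-correlation of `series` at a positive integer lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must satisfy 0 < lag < len(series)")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0  # a constant series has no variation to correlate
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean)
        for t in range(n - lag)
    )
    return covariance / variance


# Hypothetical series with a period of 24 samples (e.g., hourly readings
# with a daily cycle).
series = [math.sin(2 * math.pi * t / 24) for t in range(240)]

full_period = autocorrelation(series, 24)  # strongly positive
half_period = autocorrelation(series, 12)  # strongly negative
```

A strongly positive value at a lag equal to the period, and a strongly
negative one at half the period, is the kind of regular pattern that
auto-correlation analysis is meant to expose.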
First, we consider the case of correlations across many different
streams. In general, we assume for simplicity that values from all
streams are observed together; if that is not the case, then additional
pre-processing or analysis may be necessary. Streams in a large
collection are often inherently correlated (e.g., temperatures in the
same building, traffic in the same network, prices in the same market),
and it is possible to reduce hundreds of numerical streams into just a
handful of hidden variables that compactly describe the key trends and
dramatically reduce the complexity of further data processing. We will
present an approach that does this incrementally.
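
As a rough illustration of what incremental tracking of a hidden
variable can look like, the sketch below maintains a single direction
vector with a recursive least-squares style update, similar in flavor to
projection approximation subspace tracking with deflation. This is an
assumption made for illustration and is not necessarily the method
presented in this chapter. The function name, the forgetting factor of
0.96, the initial values, and the synthetic three-stream data are all
hypothetical.

```python
import math


def track_hidden_variable(stream, n_streams, forgetting=0.96):
    """Track one hidden variable over a stream of value vectors.

    Each incoming vector is processed once and then discarded; only the
    direction `w` and a running energy estimate are kept between steps.
    """
    w = [1.0] + [0.0] * (n_streams - 1)  # arbitrary starting direction
    energy = 1e-3                        # small positive initial energy
    hidden = []
    for x in stream:
        # Hidden variable: projection of the new values onto w.
        y = sum(wi * xi for wi, xi in zip(w, x))
        # Exponentially weighted energy of the hidden variable.
        energy = forgetting * energy + y * y
        # Reconstruction error when x is approximated by y * w.
        err = [xi - y * wi for xi, wi in zip(x, w)]
        # Nudge w toward the direction that reduces this error.
        w = [wi + (y / energy) * ei for wi, ei in zip(w, err)]
        hidden.append(y)
    return w, hidden


# Hypothetical data: three streams driven by one shared hidden signal.
mix = [1.0, 2.0, 3.0]
stream = (
    [m * (2.0 + math.sin(0.3 * t)) for m in mix] for t in range(500)
)
w, hidden = track_hidden_variable(stream, n_streams=3)
```

In this toy setting the tracked direction `w` lines up with the mixing
weights `mix`, so the three streams can be summarized by the single
series `hidden`. No past stream values are stored and no pair of streams
is compared directly.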