DIMENSIONALITY REDUCTION AND FILTERING ON TIME SERIES SENSOR STREAMS - Managing and Mining Sensor Data

forecasting, and anomaly detection. We aim to provide a unified view

of time series stream mining techniques for dimensionality reduction

(analysis and data reduction across streams) and filtering (analysis and

data reduction across time).

We describe methods that capture correlations and find hidden variables

that describe trends in collections of streams. Discovered trends

can then be used to quickly spot potential anomalies and do efficient

forecasting. We describe a method which can incrementally find these

correlation patterns and hidden variables, which summarize the key

trends in the entire stream collection, with no buffering of stream values

and without directly comparing pairs of streams. Moreover, it is

any-time and dynamically detects changes. We also describe efficient

online methods for quick forecasting (estimation of future values) and

imputation (estimation of past, missing values) on multiple time series

streams. Finally, we describe methods that can capture and summarize

auto-correlations (correlations within a single series, across time), which

also describe key trends. We also briefly explain how these techniques

relate to others, and illustrate various trade-offs that are available to

practitioners.

Keywords: streams, time series, filtering, dimensionality reduction, forecasting

1. Introduction

In this chapter, we consider the problem of capturing correlations

both across multiple streams and across time (auto-correlations).

As we shall see, these two problems are inherently related, and similar

techniques are applicable to both, even though the interpretation of the

results may be different. In the first case, correlations across different

streams allow us to find hidden variables that can summarize collections

of time series data streams. In the second case, auto-correlations summarize

patterns across time that can capture regular or periodic trends

in time series streams.
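
To make the second case concrete, the following minimal sketch computes the sample auto-correlation of a single synthetic series and looks for the lag at which the series most strongly repeats. It is only an illustration of what an auto-correlation measures, not one of the methods described in this chapter. The function name `autocorrelation`, the synthetic series, its period of 24 samples, and the noise level are all assumptions made for the example.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample auto-correlation of x for lags 0..max_lag (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()                      # remove the mean
    denom = np.dot(xc, xc)                 # lag-0 energy, so acf[0] == 1
    return np.array([
        np.dot(xc[:len(xc) - lag], xc[lag:]) / denom
        for lag in range(max_lag + 1)
    ])

# Synthetic series (assumed for illustration): period of 24 samples plus noise.
rng = np.random.default_rng(0)
t = np.arange(600)
series = np.sin(2 * np.pi * t / 24) + 0.2 * rng.normal(size=t.size)

acf = autocorrelation(series, max_lag=36)

# Skip the small lags, where any smooth series is trivially self-similar,
# and pick the lag with the strongest correlation.
best_lag = 12 + int(np.argmax(acf[12:]))
print("strongest repeat at lag", best_lag)
```

For a series like this one, the strongest repeat should fall at or very near the true period, and the correlation at half the period should be strongly negative. That is the sense in which auto-correlations summarize regular or periodic trends.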

First we consider the case of correlations across many different streams.

In general, we assume for simplicity that values from all streams are observed

together; if that is not the case, then additional pre-processing

or analysis may be necessary. Streams in a large collection are often

inherently correlated (e.g., temperatures in the same building, traffic in

the same network, prices in the same market, etc.) and it is possible to

reduce hundreds of numerical streams into just a handful of hidden variables

that compactly describe the key trends and dramatically reduce

the complexity of further data processing. We will present an approach

to do this incrementally.
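
As a rough illustration of why a handful of hidden variables can suffice, the sketch below builds 100 synthetic streams that are noisy mixtures of two underlying trends, then recovers two hidden variables with ordinary batch PCA (computed via SVD). This is deliberately not the incremental approach referred to above: it buffers the whole data matrix, which a streaming method avoids. The names `make_streams` and `hidden_variables`, the choice of trends, and the noise level are assumptions made for the example.

```python
import numpy as np

def make_streams(T=500, n=100, seed=0):
    """Synthetic data (assumed): n streams driven by 2 shared trends plus noise."""
    rng = np.random.default_rng(seed)
    t = np.arange(T)
    trends = np.column_stack([np.sin(2 * np.pi * t / 50), t / T])  # T x 2
    mixing = rng.normal(size=(2, n))        # how each stream weights each trend
    noise = 0.05 * rng.normal(size=(T, n))
    return trends @ mixing + noise          # T x n matrix, one column per stream

def hidden_variables(X, k):
    """Batch PCA via SVD: project the streams onto their top-k directions.

    Returns (H, W, energy) where H is T x k (hidden variables over time),
    W is n x k (weight of each stream in each hidden variable), and
    energy is the fraction of total variance retained.
    """
    Xc = X - X.mean(axis=0)                           # center each stream
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T
    H = Xc @ W
    energy = float((s[:k] ** 2).sum() / (s ** 2).sum())
    return H, W, energy

X = make_streams()
H, W, energy = hidden_variables(X, k=2)
print("streams:", X.shape[1], "hidden variables:", H.shape[1])
print("fraction of variance retained:", round(energy, 4))
```

Because the synthetic streams share only two underlying trends, two hidden variables retain nearly all of the variance, and any downstream processing can operate on the two columns of `H` instead of the 100 columns of `X`. Real sensor collections are less tidy, so the number of hidden variables needed would have to be chosen from the data.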
