DIMENSIONALITY REDUCTION AND FILTERING ON TIME SERIES SENSOR STREAMS - Managing and Mining Sensor Data

5. MUSCLES

MUSCLES (MUlti-SequenCe LEast Squares) [58] tries to predict the value of one stream, $x_{t,i}$, based on the previous values from all streams, $x_{t-l,j}$, $l \ge 1$, $1 \le j \le n$, and the current values from the other streams, $x_{t,j}$, $j \ne i$. It uses multivariate autoregression, thus the prediction $\hat{x}_{t,i}$ for a given stream $i$ is, similar to Eq. 5.2,

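$$
\begin{aligned}
\hat{x}_{t,i} ={} & \phi_{1,0}\, x_{t,1} + \phi_{1,1}\, x_{t-1,1} + \dots + \phi_{1,W}\, x_{t-W,1} + {} \\
& \dots + {} \\
& \phi_{i-1,0}\, x_{t,i-1} + \phi_{i-1,1}\, x_{t-1,i-1} + \dots + \phi_{i-1,W}\, x_{t-W,i-1} + {} \\
& \phi_{i,1}\, x_{t-1,i} + \dots + \phi_{i,W}\, x_{t-W,i} + {} \\
& \phi_{i+1,0}\, x_{t,i+1} + \phi_{i+1,1}\, x_{t-1,i+1} + \dots + \phi_{i+1,W}\, x_{t-W,i+1} + {} \\
& \dots + {} \\
& \phi_{n,0}\, x_{t,n} + \phi_{n,1}\, x_{t-1,n} + \dots + \phi_{n,W}\, x_{t-W,n} + \epsilon_t ,
\end{aligned}
$$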

and employs RLS to continuously update the coefficients $\phi_{i,j}$ such that the prediction error

$$
\sum_{\tau=1}^{t} \left( x_{\tau,i} - \hat{x}_{\tau,i} \right)^2
$$

is minimized. Note that the above equation has one dependent variable (the estimate $\hat{x}_{t,i}$) and $v = W \cdot n + n - 1$ independent variables (the past values of all streams plus the current values of all other streams except $i$).
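To make the count of independent variables concrete, the following is a minimal sketch of how the regressor vector for one prediction could be assembled. The function names `build_regressors` and `predict`, the row-per-time-step layout of `x`, and the 0-based indices are assumptions made for illustration; they are not taken from [58].

```python
from typing import List, Sequence


def build_regressors(
    x: Sequence[Sequence[float]], t: int, i: int, W: int
) -> List[float]:
    """Collect the independent variables used to predict x[t][i].

    x[t][j] holds the value of stream j at time t (0-based indices,
    an assumption of this sketch).
    """
    if t < W:
        raise ValueError("need at least W past time steps")
    n = len(x[t])
    features: List[float] = []
    for j in range(n):
        # The current value (lag 0) is used for every stream except the
        # target stream i, whose current value is what we are predicting.
        first_lag = 1 if j == i else 0
        for lag in range(first_lag, W + 1):
            features.append(x[t - lag][j])
    return features


def predict(phi: Sequence[float], features: Sequence[float]) -> float:
    """Linear prediction: dot product of coefficients and regressors."""
    return sum(p * f for p, f in zip(phi, features))


if __name__ == "__main__":
    # Toy data: 5 time steps, n = 3 streams, window W = 2.
    x = [[10 * t + j for j in range(3)] for t in range(5)]
    z = build_regressors(x, t=4, i=1, W=2)
    print(len(z))  # should equal W * n + n - 1
```

For n streams and window W the returned list has W * n + n - 1 entries: W past values for each of the n streams, plus the current values of the n - 1 streams other than i.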

Exponentially forgetting MUSCLES employs a forgetting factor $0 < \lambda \le 1$ and minimizes instead

$$
\sum_{\tau=1}^{t} \lambda^{t-\tau} \left( x_{\tau,i} - \hat{x}_{\tau,i} \right)^2 .
$$

For $\lambda < 1$, errors for old values are down-weighted by an exponential factor, hence permitting the estimate to adapt as sequence characteristics change.
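The effect of the forgetting factor can be illustrated with a small recursive least squares sketch. This is the textbook RLS recursion with exponential forgetting, written under stated assumptions rather than the exact procedure of [58]: the class name `ForgettingRLS`, the initialisation constant `delta`, and the toy data in `fit_drifting_stream` are all hypothetical.

```python
import math
from typing import List, Sequence


class ForgettingRLS:
    """Recursive least squares with forgetting factor lam (0 < lam <= 1)."""

    def __init__(self, v: int, lam: float = 1.0, delta: float = 1e4) -> None:
        if not 0.0 < lam <= 1.0:
            raise ValueError("forgetting factor must satisfy 0 < lam <= 1")
        self.lam = lam
        self.phi = [0.0] * v  # coefficient estimates
        # Gain matrix, started as delta * I (a large delta means a weak prior).
        self.P = [[delta if r == c else 0.0 for c in range(v)] for r in range(v)]

    def predict(self, z: Sequence[float]) -> float:
        return sum(p * f for p, f in zip(self.phi, z))

    def update(self, z: Sequence[float], target: float) -> float:
        """Absorb one (regressors, observed value) pair; return the prior error."""
        v = len(z)
        Pz = [sum(self.P[r][c] * z[c] for c in range(v)) for r in range(v)]
        denom = self.lam + sum(z[r] * Pz[r] for r in range(v))
        gain = [p / denom for p in Pz]
        error = target - self.predict(z)
        self.phi = [p + g * error for p, g in zip(self.phi, gain)]
        # P <- (P - gain * z^T P) / lam; since P is symmetric, z^T P = (P z)^T.
        self.P = [
            [(self.P[r][c] - gain[r] * Pz[c]) / self.lam for c in range(v)]
            for r in range(v)
        ]
        return error


def fit_drifting_stream(lam: float, steps: int = 400) -> List[float]:
    """Toy data: the true coefficients switch from (2, -1) to (-1, 3) halfway."""
    model = ForgettingRLS(v=2, lam=lam)
    for k in range(steps):
        z = [math.sin(0.3 * k), math.cos(0.7 * k)]
        a, b = (2.0, -1.0) if k < steps // 2 else (-1.0, 3.0)
        model.update(z, a * z[0] + b * z[1])
    return model.phi


if __name__ == "__main__":
    print("lam = 0.9:", fit_drifting_stream(0.9))
    print("lam = 1.0:", fit_drifting_stream(1.0))
```

With lam = 0.9 the estimate should settle close to the coefficients in force after the switch, whereas with lam = 1 all past errors keep equal weight and the estimate stays a compromise between the two regimes. Each update costs on the order of v squared operations, because the v-by-v gain matrix is revised at every step.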

5.1 Selective MUSCLES

In case we have too many time sequences (e.g., n = 100,000 nodes in a network, producing information about their load every minute), even the incremental version of MUSCLES will suffer. The solution to this problem is based on the conjecture that we do not really need information from every sequence to make a good estimation of a missing value. Much of the benefit of using multiple sequences may be captured
