Problem definition. Given a collection of $n$ co-evolving, semi-infinite
streams, producing a value $x_{t,j}$ for every stream $1 \le j \le n$ and for
every time-tick $t = 1, 2, \ldots$, SPIRIT does the following: (i) Adapts the
number $k$ of hidden variables necessary to explain/summarise the main
trends in the collection. (ii) Adapts the participation weights $w_{i,j}$ of the
$j$-th stream on the $i$-th hidden variable ($1 \le j \le n$ and $1 \le i \le k$), so as
to produce an accurate summary of the stream collection. (iii) Monitors
the hidden variables $y_{t,i}$, for $1 \le i \le k$. (iv) Keeps updating all the
above efficiently.
More precisely, SPIRIT operates on the column-vectors of observed
stream values $x_t := [x_{t,1}, \ldots, x_{t,n}]^T$ and continually updates the
participation weights $w_{i,j}$. The participation weight vector $w_i$ for the
$i$-th principal direction is $w_i := [w_{i,1} \cdots w_{i,n}]^T$. The hidden variables
$y_t := [y_{t,1}, \ldots, y_{t,k}]^T$ are the projections of $x_t$ onto each $w_i$, over time
(see Table 5.1), i.e.,
$$y_{t,i} := w_{i,1} x_{t,1} + w_{i,2} x_{t,2} + \cdots + w_{i,n} x_{t,n}.$$
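The projection above is one dot product per hidden variable. A minimal Python sketch, assuming the weights are stored as a list of $k$ weight vectors of length $n$; the function name `project` and the example numbers are illustrative assumptions, not taken from the text:

```python
def project(W, x_t):
    """Return y_t, where y_t[i] = sum over j of W[i][j] * x_t[j]."""
    return [sum(w_ij * x_tj for w_ij, x_tj in zip(w_i, x_t)) for w_i in W]


# Hypothetical example: n = 3 streams, k = 2 hidden variables.
W = [[1.0, 0.0, 0.0],   # w_1
     [0.0, 0.6, 0.8]]   # w_2
x_t = [2.0, 5.0, 10.0]  # observed stream values at one time-tick

y_t = project(W, x_t)   # one hidden-variable value per weight vector
print(y_t)
```

Each call handles a single time-tick, so nothing beyond the current weights has to be kept in memory to produce $y_t$.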
SPIRIT also adapts the number $k$ of hidden variables necessary to
capture most of the information. The adaptation is performed so that the
approximation achieves a desired mean-square error. In particular, let
$\tilde{x}_t = [\tilde{x}_{t,1} \cdots \tilde{x}_{t,n}]^T$ be the reconstruction of $x_t$, based on the weights
and hidden variables, defined by
$$\tilde{x}_{t,j} := w_{1,j} y_{t,1} + w_{2,j} y_{t,2} + \cdots + w_{k,j} y_{t,k},$$
or more succinctly, $\tilde{x}_t = \sum_{i=1}^{k} y_{t,i} w_i$.
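The reconstruction and its squared error can be sketched in a few lines of Python. This is an illustration under assumptions: the names `reconstruct` and `squared_error` and the example numbers are hypothetical, and the two weight vectors are chosen to be orthonormal for convenience.

```python
def reconstruct(W, y_t):
    """Return the reconstruction, whose j-th entry is sum over i of W[i][j] * y_t[i]."""
    n = len(W[0])
    return [sum(W[i][j] * y_t[i] for i in range(len(W))) for j in range(n)]


def squared_error(x_t, x_rec):
    """Squared Euclidean distance between x_t and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x_t, x_rec))


# Hypothetical example: n = 3 streams, k = 2 orthonormal weight vectors.
W = [[1.0, 0.0, 0.0],
     [0.0, 0.6, 0.8]]
x_t = [2.0, 5.0, 10.0]

# Hidden variables: projections of x_t onto each weight vector.
y_t = [sum(w * x for w, x in zip(w_i, x_t)) for w_i in W]

x_rec = reconstruct(W, y_t)
err = squared_error(x_t, x_rec)
print(x_rec, err)
```

With orthonormal weight vectors, as in this example, the squared error equals the part of the squared norm of $x_t$ that the hidden variables do not capture, which is why a mean-square-error target can be used to decide how many hidden variables to keep.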
In the chlorine example, $x_t$ is the $n$-dimensional column-vector of
the original sensor measurements and $y_t$ is the hidden variable
column-vector, both at time $t$. The dimension of $y_t$ is 1 before/after the leak
($t < 1500$ or $t > 3000$) and 2 during the leak ($1500 \le t \le 3000$), as
shown in Figure 5.1.
Definition 5.4 (SPIRIT Tracking) SPIRIT updates the participation
weights $w_{i,j}$ so as to guarantee that the reconstruction error
$\|\tilde{x}_t - x_t\|^2$ over time is predictably small.
This informal definition describes what SPIRIT does. The precise criteria
regarding the reconstruction error will be explained later. If we
assume that the x t are drawn according to some distribution that does
not change over time (i.e., under stationarity assumptions), then the
weight vectors w i converge to the principal directions. However, even if
there are non-stationarities in the data (i.e., gradual drift), in practice
we can deal with these very effectively, as we explain later.
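To make "principal direction" concrete, the sketch below computes the leading one in batch for a small synthetic two-stream collection. It is a reference computation only, not SPIRIT's incremental update: it uses plain power iteration on the uncentered matrix $\sum_t x_t x_t^T$, and the function name, the synthetic data and the iteration count are all assumptions made for illustration.

```python
import math


def leading_direction(X, iters=100):
    """Power iteration on C = sum over t of x_t x_t^T; returns a unit vector."""
    n = len(X[0])
    C = [[sum(x[a] * x[b] for x in X) for b in range(n)] for a in range(n)]
    w = [1.0] * n
    for _ in range(iters):
        v = [sum(C[a][b] * w[b] for b in range(n)) for a in range(n)]
        norm = math.sqrt(sum(c * c for c in v))
        w = [c / norm for c in v]
    return w


# Synthetic data (assumption): the second stream is roughly twice the first,
# plus a small deterministic wobble.
X = []
for t in range(1, 51):
    a = math.sin(0.3 * t)
    X.append([a, 2.0 * a + 0.01 * math.cos(1.7 * t)])

w = leading_direction(X)
print(w)  # close to the direction [1, 2] scaled to unit length
```

Because the two synthetic streams move together, a single direction explains almost all of their variation, which is the situation in which one hidden variable suffices.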