Typically, in collections of $n$-dimensional points $x_t = [x_{t,1}, \ldots, x_{t,n}]^T$, $t = 1, 2, \ldots$, there exist correlations between the $n$ dimensions (which correspond to streams in our setting). These can be captured by principal components analysis (PCA). Consider for example the setting in Figure 5.3. There is a visible linear correlation. Thus, if we represent every point with its projection on the direction of $w_1$, the error of this approximation is very small. In fact, the first principal direction $w_1$ is optimal in the following sense.
Definition 5.1 (First principal component) Given a collection of $n$-dimensional vectors $x_\tau \in \mathbb{R}^n$, $\tau = 1, 2, \ldots, t$, the first principal direction $w_1 \in \mathbb{R}^n$ is the vector minimizing the sum of squared residuals, i.e.,
$$ w_1 := \arg\min_{\|w\|=1} \sum_{\tau=1}^{t} \left\| x_\tau - (w w^T)\, x_\tau \right\|^2 . $$
The projection of $x_\tau$ onto $w_1$ is the first principal component (PC) $y_{\tau,1} := w_1^T x_\tau$, $\tau = 1, \ldots, t$.
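As a concrete illustration, here is a minimal sketch (assuming numpy; the synthetic data and variable names are ours, not from the text) that computes $w_1$ for a small batch of points as the leading eigenvector of $X_t^T X_t$, which is equivalent to the minimisation in Definition 5.1, and then forms the first PC $y_{\tau,1} = w_1^T x_\tau$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: t points in n = 2 dimensions with a strong linear correlation,
# mimicking the setting of Figure 5.3.
t, n = 500, 2
s = rng.normal(size=t)
X = np.column_stack([s, 0.8 * s + 0.1 * rng.normal(size=t)])  # t x n matrix

# The minimiser of sum_tau ||x_tau - (w w^T) x_tau||^2 over unit vectors w
# is the eigenvector of X^T X with the largest eigenvalue.
eigvals, eigvecs = np.linalg.eigh(X.T @ X)   # eigh returns ascending eigenvalues
w1 = eigvecs[:, -1]                          # first principal direction

# First principal component of every point: y_{tau,1} = w1^T x_tau.
y1 = X @ w1

# Reconstruction from the first PC alone, and its (small) residual error.
X_hat = np.outer(y1, w1)
print("squared residual:", np.sum((X - X_hat) ** 2))
```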
Note that, since $\|w_1\| = 1$, we have $(w_1 w_1^T) x_\tau = (w_1^T x_\tau) w_1 = y_{\tau,1} w_1 =: \tilde{x}_\tau$, where $\tilde{x}_\tau$ is the projection of $y_{\tau,1}$ back into the original $n$-D space. That is, $\tilde{x}_\tau$ is the reconstruction of the original measurements from the first PC $y_{\tau,1}$. More generally, PCA will produce $k$ vectors $w_1, w_2, \ldots, w_k$ such that, if we represent each $n$-D data point $x_t := [x_{t,1} \cdots x_{t,n}]^T$ with its $k$-D projection $y_t := [w_1^T x_t \cdots w_k^T x_t]^T$, then this representation minimises the squared error $\sum_\tau \|x_\tau - \tilde{x}_\tau\|^2$. Furthermore, the principal directions are orthogonal, so the principal components $y_{\tau,i}$, $1 \le i \le k$, are by construction uncorrelated, i.e., if $y^{(i)} := [y_{1,i}, \ldots, y_{t,i}, \ldots]^T$ is the stream of the $i$-th principal component, then $(y^{(i)})^T y^{(j)} = 0$ if $i \ne j$.
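A sketch of the general $k$-dimensional case under the same assumptions (numpy; the helper name pca_project is ours): stacking the top $k$ eigenvectors of $X_t^T X_t$ as columns of $W$ gives the projections $y_t = W^T x_t$ and reconstructions $\tilde{x}_t = W y_t$, and the resulting PC streams are numerically uncorrelated, as claimed above:

```python
import numpy as np

def pca_project(X, k):
    """Top-k principal directions of X (t x n), plus projections and
    reconstructions, following the definitions above (no mean-centering,
    matching Definition 5.1)."""
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)
    W = eigvecs[:, ::-1][:, :k]      # n x k, decreasing eigenvalue order
    Y = X @ W                        # t x k, principal components y_t
    X_hat = Y @ W.T                  # t x n, reconstructions x~_t
    return W, Y, X_hat

# The PCs are uncorrelated by construction: Y^T Y = W^T (X^T X) W is
# (numerically) diagonal, since the columns of W are eigenvectors of X^T X.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))
W, Y, X_hat = pca_project(X, k=3)
G = Y.T @ Y
off_diag = G - np.diag(np.diag(G))
print("max |y^(i)^T y^(j)| for i != j:", np.abs(off_diag).max())
```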
Observation 3.1 (Dimensionality reduction) If we represent each $n$-dimensional point $x_\tau \in \mathbb{R}^n$ using all $n$ principal components, then the error $\|x_\tau - \tilde{x}_\tau\| = 0$. However, in typical datasets, we can achieve a very small error using only $k$ principal components, where $k \ll n$.
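To see this observation numerically, the following self-contained sketch (again assuming numpy; the synthetic data has intrinsic dimensionality 2 by construction) prints the squared reconstruction error as $k$ grows: it is already tiny for $k \ll n$ and vanishes, up to floating point, at $k = n$:

```python
import numpy as np

rng = np.random.default_rng(2)
t, n = 300, 10
Z = rng.normal(size=(t, 2))                          # 2 latent factors
X = Z @ rng.normal(size=(2, n)) + 0.01 * rng.normal(size=(t, n))

eigvals, eigvecs = np.linalg.eigh(X.T @ X)
for k in (1, 2, 3, n):
    W = eigvecs[:, ::-1][:, :k]                      # top-k principal directions
    err = np.sum((X - X @ W @ W.T) ** 2)             # sum_tau ||x_tau - x~_tau||^2
    print(f"k={k:2d}  squared error={err:.6f}")
```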
In the context of the chlorine example, each point in Figure 5.3 would correspond to the 2-D projection of $x_\tau$ (where $1 \le \tau \le t$) onto the first two principal directions, $w_1$ and $w_2$, which are the most important according to the distribution of $\{x_\tau \mid 1 \le \tau \le t\}$. The principal components $y_{\tau,1}$ and $y_{\tau,2}$ are the coordinates of these projections in the orthogonal coordinate system defined by $w_1$ and $w_2$.
However, batch methods for estimating the principal components require time that depends on the duration $t$, which grows to infinity. In fact, the principal directions are the eigenvectors of $X_t^T X_t$, which are