Therefore, PCA is a linear projection method that maximizes the inertia of
the scatter diagram.
Before describing the theoretical developments, let us review, as a simple
illustration, the example of the distribution of a scatter diagram in
$\mathbb{R}^2$ shown in Fig. 3.1. The first main axis found by PCA is the axis
with respect to which the inertia of the scatter diagram is maximal. The second
axis, orthogonal to the previous one, is the axis with respect to which the
inertia of the scatter diagram is maximal in the null space of the first axis.
The other axes are defined similarly.
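As a rough illustration of the two-dimensional case, the following sketch finds the two main axes of a small scatter diagram. It assumes that "inertia with respect to an axis" means the sum of squared projections of the centered points onto that axis, and it uses the closed-form angle that maximizes this quantity in two dimensions. The function name `main_axes_2d` and the sample points are hypothetical and do not come from the text.

```python
import math

def main_axes_2d(points):
    """Return (axis1, axis2, inertia1, inertia2) for a 2-D scatter diagram.

    Assumption: the inertia along an axis is the sum of squared projections
    of the centered points onto that axis.
    """
    n_pts = len(points)
    gx = sum(p[0] for p in points) / n_pts
    gy = sum(p[1] for p in points) / n_pts
    centered = [(x - gx, y - gy) for x, y in points]

    # Entries of the 2x2 matrix X^T X computed on the centered data.
    sxx = sum(x * x for x, _ in centered)
    syy = sum(y * y for _, y in centered)
    sxy = sum(x * y for x, y in centered)

    # Direction that maximizes the projected inertia (closed form in 2-D).
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    axis1 = (math.cos(theta), math.sin(theta))
    axis2 = (-math.sin(theta), math.cos(theta))  # orthogonal to axis1

    def projected_inertia(axis):
        return sum((x * axis[0] + y * axis[1]) ** 2 for x, y in centered)

    return axis1, axis2, projected_inertia(axis1), projected_inertia(axis2)

# Hypothetical scatter diagram stretched along the first bisector.
sample = [(2.0, 2.0), (-2.0, -2.0), (1.0, -1.0), (-1.0, 1.0)]
axis1, axis2, inertia1, inertia2 = main_axes_2d(sample)
```

For this sample the first axis lies along the first bisector and carries most of the inertia, while the second axis carries the remainder.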
PCA and Gram-Schmidt Orthogonalization
This procedure may be reminiscent of the Gram-Schmidt orthogonalization
described in the previous chapter for the selection of inputs. That analogy,
however, is deceptive. PCA is a procedure that is carried out in representation
space, in which each observation is represented by a point, whose co-ordinates
are the values of the factors that correspond to that observation. By contrast,
Gram-Schmidt orthogonalization for the selection of inputs is carried out in
the observation space, where each factor is represented by a vector, the
components of which are observations of this factor in the database. The
dimension of representation space is the number of factors of the model, whilst
the dimension of observation space is the number of observations in the database.
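The distinction between the two spaces can be made concrete with a small sketch. The data matrix below is hypothetical (4 observations of 2 factors), and the variable names are illustrative only: rows are read as points of representation space, columns as vectors of observation space.

```python
# Hypothetical database: N = 4 observations (rows), n = 2 factors (columns).
X = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 15.0],
    [4.0, 30.0],
]

# Representation space: one point per observation,
# dimension = number of factors (this is where PCA operates).
points = [tuple(row) for row in X]

# Observation space: one vector per factor,
# dimension = number of observations (this is where Gram-Schmidt
# orthogonalization for input selection operates).
factor_vectors = [tuple(column) for column in zip(*X)]

representation_dim = len(points[0])      # number of factors
observation_dim = len(factor_vectors[0])  # number of observations
```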
Figure 3.2 shows the two main axes, which coincide with the first and second
bisectors respectively (the orthogonality of the axes is distorted by the scale
of the graph). The main components are the projections of the points onto
the main axes. The linear transformation performed by PCA is therefore a change
of variables, defined by the main axes, applied to the centered data.
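The change of variables can be sketched as follows, assuming the main axes are already known and orthonormal. Here they are taken to be the first and second bisectors, as in the figure described above. The helper `main_components` and the sample points are hypothetical.

```python
import math

def main_components(points, axes):
    """Project centered points onto the given orthonormal axes.

    Returns one list of coordinates per point, one coordinate per axis.
    """
    n_pts = len(points)
    dim = len(points[0])
    # Centre of gravity of the scatter diagram.
    g = [sum(p[j] for p in points) / n_pts for j in range(dim)]
    centered = [[p[j] - g[j] for j in range(dim)] for p in points]
    # Each new coordinate is the scalar product of a centered point with an axis.
    return [[sum(c[j] * a[j] for j in range(dim)) for a in axes]
            for c in centered]

s = 1.0 / math.sqrt(2.0)
bisector_axes = [(s, s), (-s, s)]  # first and second bisector, unit length

# Hypothetical points whose centre of gravity is (1, 1).
sample = [(3.0, 3.0), (-1.0, -1.0), (2.0, 0.0), (0.0, 2.0)]
components = main_components(sample, bisector_axes)

# An orthonormal change of variables leaves the total inertia unchanged.
total_inertia = sum(v * v for row in components for v in row)
```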
We will show that the “mechanical” concept of total inertia of the scatter
diagram is equivalent to the “statistical” concept of variance. The inertia
of points is computed with respect to the centre of gravity of the scatter
diagram. We denote by g the centre of gravity and by $I_n$ the inertia of the
scatter diagram defined in $\mathbb{R}^n$; we have

$$g_j = \frac{1}{N} \sum_{i=1}^{N} x_{ij}, \qquad I_n = \sum_{i=1}^{N} \sum_{j=1}^{n} (x_{ij} - g_j)^2 .$$

Inertia $I_n$ is therefore equal to the trace of the variance-covariance matrix of
the data X defined by

$$V = (X - Ig)^T (X - Ig),$$

where I denotes the identity matrix.
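The equality between the inertia and the trace can be checked entry by entry. The short derivation below is a sketch under one assumption that the text does not state explicitly: the product Ig is read as the N × n matrix whose rows all equal the centre of gravity g, so that the subtraction from X is well defined.

```latex
\begin{align*}
V_{jj} &= \sum_{i=1}^{N} (x_{ij} - g_j)^2 , \\
\operatorname{Trace}(V) &= \sum_{j=1}^{n} V_{jj}
  = \sum_{j=1}^{n} \sum_{i=1}^{N} (x_{ij} - g_j)^2
  = I_n .
\end{align*}
```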
Since inertia is shift-invariant, the data may be centered by replacing X with
X - Ig, so that one has the following simple relation between the inertia and the
variance-covariance matrix on the new centered data X:

$$I_n = \operatorname{Trace}(X^T X).$$
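The relation between inertia and trace, and the shift-invariance of the inertia, can be checked numerically on a small example. This is a minimal sketch in plain Python. The data matrix and all function names are hypothetical, and centering is implemented by subtracting the centre of gravity from every row, which is how the product Ig is interpreted here.

```python
def centre_of_gravity(X):
    """Mean of each column (factor) over the N observations."""
    n_obs = len(X)
    return [sum(row[j] for row in X) / n_obs for j in range(len(X[0]))]

def centre(X):
    """Subtract the centre of gravity from every row of X."""
    g = centre_of_gravity(X)
    return [[x - gj for x, gj in zip(row, g)] for row in X]

def inertia(X):
    """Sum of squared distances of the points to the centre of gravity."""
    return sum(v * v for row in centre(X) for v in row)

def scatter_matrix(X):
    """Matrix Xc^T Xc computed on the centered data Xc (no 1/N factor)."""
    Xc = centre(X)
    n = len(Xc[0])
    return [[sum(row[j] * row[k] for row in Xc) for k in range(n)]
            for j in range(n)]

def trace(M):
    return sum(M[j][j] for j in range(len(M)))

# Hypothetical data: N = 4 observations, n = 2 factors.
X = [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0], [4.0, 30.0]]
shifted = [[x + 100.0 for x in row] for row in X]

total_inertia = inertia(X)
trace_of_V = trace(scatter_matrix(X))
shifted_inertia = inertia(shifted)
```

Here `total_inertia` and `trace_of_V` agree up to rounding, and `shifted_inertia` matches both, since translating every point moves the centre of gravity by the same amount.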