we plot the coordinates given by the n rows of $\mathbf{U}\boldsymbol{\Sigma}\mathbf{J}$ to give the r-dimensional scatterplot
and the p rows of $\mathbf{V}\mathbf{J}$ to give the directions of the axes. How to calibrate these axes is
discussed below.
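As a minimal sketch of these plotting coordinates, the following NumPy code computes the sample coordinates and the axis directions from a singular value decomposition. The function name `pca_biplot_coordinates`, the random data and the column-centring step are assumptions made for illustration; the zero columns that J would leave behind are simply dropped.

```python
import numpy as np

def pca_biplot_coordinates(X, r=2):
    """Return (sample coordinates, axis directions) for an r-dimensional PCA biplot.

    Assumption: X is centred here by subtracting its column means.
    """
    Xc = X - X.mean(axis=0)
    # Thin SVD: Xc = U @ diag(s) @ Vt, singular values in decreasing order.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = U[:, :r] * s[:r]   # n x r sample coordinates (first r columns of U Sigma)
    V_r = Vt[:r].T         # p x r axis directions (first r columns of V)
    return Z, V_r

# Hypothetical data: 10 samples on 3 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
Z, V_r = pca_biplot_coordinates(X, r=2)
```

Each row of `Z` is one point of the scatterplot and each row of `V_r` gives the direction of one variable's axis in the plotting plane.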
Note that $\mathbf{V}\mathbf{J}\mathbf{V}'$ is the required projection matrix. Thus, the rows of $\mathbf{X}\mathbf{V}\mathbf{J}\mathbf{V}'$ give the
projections of $\mathbf{X}$ onto the r-dimensional hyperplane relative to the original p orthogonal
axes, while the rows of $\mathbf{X}\mathbf{V}\mathbf{J}$ give the projections of the same points relative to r orthogonal
axes in the hyperplane, the remaining p - r dimensions being zero. In particular, the
unit points on the original p axes project to $\mathbf{I}\mathbf{V}\mathbf{J}\mathbf{V}'$ and $\mathbf{I}\mathbf{V}\mathbf{J}$, respectively, showing that
the first r columns of $\mathbf{V}\mathbf{J}$ give the direction cosines in r dimensions of the best-fitting
plane. The columns of $\mathbf{V}$ are often termed the principal components, or the principal
component loadings, and $\mathbf{X}\mathbf{V}$ is interpreted as new latent variables. Because
$\mathbf{V}'\mathbf{X}'\mathbf{X}\mathbf{V} = \boldsymbol{\Sigma}^2$, which is diagonal, these latent variables are uncorrelated and in the literature
much is made of the possibilities of their interpretation. Here, we almost entirely ignore
this aspect, which has its roots in factor analysis, and concentrate on representing and
interpreting the original variables themselves. For us, the main interest in the uncorrelated
property is that it gives us orthogonal axes in r dimensions that simplify the plotting of
visualizations, but which need not themselves be shown. One important property is that
$$\|\mathbf{X}\|^2 = \|\hat{\mathbf{X}}_{[r]}\|^2 + \|\mathbf{X} - \hat{\mathbf{X}}_{[r]}\|^2, \tag{3.2}$$
showing that the total sum of squares is the sum of the fitted and residual sums of
squares. This underpins the proper use of measures such as the 'variance accounted for'
described more fully in Section 3.3. Also, it makes it clear that minimizing the sum of
squares of the residuals is the same as maximizing the variation in the fitted plane, as
stated in Jolliffe's definition of PCA.
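The projection matrix and the sum-of-squares decomposition can be checked numerically. The sketch below is only an illustration on randomly generated, column-centred data; the variable names (`P`, `X_hat`, `gram`) and the choice r = 2 are assumptions, not part of the text.

```python
import numpy as np

# Hypothetical centred data matrix: n = 8 samples, p = 4 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(8, 4))
X = X - X.mean(axis=0)
r = 2

U, s, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T                                  # p x p orthogonal matrix

# J keeps the first r dimensions and zeroes the remaining p - r.
J = np.diag([1.0] * r + [0.0] * (X.shape[1] - r))
P = V @ J @ V.T                           # projection matrix V J V'
X_hat = X @ P                             # fitted points, original axes

total = np.sum(X ** 2)                    # ||X||^2
fitted = np.sum(X_hat ** 2)               # ||X_hat||^2
residual = np.sum((X - X_hat) ** 2)       # ||X - X_hat||^2

# Cross-products of the latent variables XV: diagonal, squared singular values.
gram = V.T @ X.T @ X @ V

print(np.isclose(total, fitted + residual))
print(np.allclose(gram, np.diag(s ** 2)))
```

Because the total is fixed by the data, any increase in `fitted` is matched by an equal decrease in `residual`, which is the equivalence between the two optimization criteria.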
3.2.1 Representation of sample points
Geometrically, we have seen that the Eckart-Young approximation orthogonally projects
the points in X onto the best two (in general, r) dimensions for visualization.
Consider the three-dimensional data set given in Table 3.3 and represented graphically
in Figure 3.3. To represent Figure 3.3 optimally in two dimensions, a plane, passing
through the centroid, must be found that minimizes the sum of squares of the distances
from the plane. We call this plane the biplot plane and denote it by L . The plane L is
shown in Figure 3.4.
Minimizing squared distances from original points amounts to orthogonal projections
onto the plane. This is illustrated in Figure 3.5, where the first two principal components
(the columns of $\mathbf{V}_r$) span the two-dimensional biplot space and the sum of the squares
of the distances represented by the arrows is minimized.
In the PCA biplot, interpolation is achieved by orthogonal projection of each sample
point onto the biplot space, and because V is orthogonal
$$\mathbf{x}'_{\text{proj}} = \mathbf{x}'\mathbf{V}_r\mathbf{V}_r' \tag{3.3}$$
is the representation of sample x, projected onto L, in terms of the three Cartesian axes.
The projection is illustrated in Figure 3.5 and the interpolated sample points are shown
in Figure 3.6.
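The interpolation of a single sample can be sketched as follows. The data here are random and hypothetical (they are not the values of Table 3.3), and the names `z`, `x_proj` and `residual` are chosen for illustration only.

```python
import numpy as np

# Hypothetical centred three-dimensional data, standing in for a real data set.
rng = np.random.default_rng(2)
X = rng.normal(size=(6, 3))
X = X - X.mean(axis=0)

_, _, Vt = np.linalg.svd(X, full_matrices=False)
V_r = Vt[:2].T            # 3 x 2: first two principal components

x = X[0]                  # one sample, as a vector of length 3
z = x @ V_r               # its two coordinates within the biplot plane
x_proj = z @ V_r.T        # the same point in the three Cartesian axes
residual = x - x_proj     # the arrow from the projection to the sample

# The residual is orthogonal to the plane, so this is close to zero.
print(residual @ V_r)
```

Here `z` is what would be plotted in the two-dimensional display, while `x_proj` locates the same projected point in the original three-dimensional space.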