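for the $i$th variable and denote it by $\Pi_i$. Axis predictivity is 'variance accounted for'
per variable, and its interpretation depends critically on the orthogonal ANOVA and the
independence result for the diagonals of $X'X$. It is this property that justifies expressing
the overall quality as a weighted sum of the $\Pi_i$ with weights

$$ w_i = \frac{(V \Lambda V')_{ii}}{\operatorname{tr}(\Lambda)} $$

given by the variance of the $i$th variable relative to the total variance. Since

$$ \Pi_i = \frac{(V \Lambda J V')_{ii}}{(V \Lambda V')_{ii}} $$

it follows that $\Pi_i \, (V \Lambda V')_{ii} = (V \Lambda J V')_{ii}$, and summing gives

$$ \sum_{i=1}^{p} \Pi_i \, (V \Lambda V')_{ii} = \operatorname{tr}(V \Lambda J V') = \operatorname{tr}(\Lambda J), $$

which, after division by $\operatorname{tr}(\Lambda)$, gives

$$ \text{quality} = \frac{\operatorname{tr}(\Lambda J)}{\operatorname{tr}(\Lambda)} = \sum_{i=1}^{p} \frac{(V \Lambda V')_{ii}}{\operatorname{tr}(\Lambda)} \, \Pi_i = \sum_{i=1}^{p} w_i \Pi_i. \tag{3.21} $$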
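The identity in (3.21) can be checked numerically. The following is a minimal sketch rather than the authors' own code: it assumes that the data matrix is column-centred, that the decomposition of $X'X$ is obtained from the singular value decomposition of the centred data, and that the data are synthetic. The function name `axis_predictivity` is hypothetical.

```python
import numpy as np

def axis_predictivity(X, r):
    """Per-variable predictivities, weights and overall quality of an
    r-dimensional PCA approximation (illustrative sketch only)."""
    Xc = X - X.mean(axis=0)                      # column-centre the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Xhat = (U[:, :r] * s[:r]) @ Vt[:r]           # rank-r approximation of Xc
    fitted = np.sum(Xhat**2, axis=0)             # diagonal of Xhat'Xhat
    total = np.sum(Xc**2, axis=0)                # diagonal of X'X
    pred = fitted / total                        # axis predictivity per variable
    weights = total / total.sum()                # variance share of each variable
    quality = np.sum(s[:r]**2) / np.sum(s**2)    # overall quality
    return pred, weights, quality

# Synthetic data (an assumption for illustration): 20 samples, 4 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4)) @ rng.normal(size=(4, 4))

pred, w, quality = axis_predictivity(X, r=2)
weighted_sum = float(np.sum(w * pred))           # should agree with quality
```

Under these assumptions `weighted_sum` and `quality` agree up to rounding error, which is the content of (3.21).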
While axis predictivity is concerned with the quality of the representation of each
variable, sample predictivity provides the quality of representation of samples. It measures
how close each sample is to its $r$-dimensional approximation: a value of unity
implies that the sample lies in the plane of approximation and a value of zero that it is
orthogonal to the plane. A word of warning: sample predictivity combines measures on
variables that may not be commensurable, thus highlighting the problems discussed in
Section 2.5.
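As an illustration of this description, sample predictivity can be sketched as the squared length of each centred sample's projection onto the first $r$ principal axes, divided by the squared length of the centred sample itself. This is a hedged sketch under that assumption, not the authors' implementation; the function name `sample_predictivity` and the synthetic planar data are hypothetical.

```python
import numpy as np

def sample_predictivity(X, r):
    """Ratio of squared projected length to squared length for each
    centred sample (illustrative sketch only)."""
    Xc = X - X.mean(axis=0)                      # column-centre the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Xhat = Xc @ Vt[:r].T @ Vt[:r]                # projection onto the first r axes
    return np.sum(Xhat**2, axis=1) / np.sum(Xc**2, axis=1)

# Synthetic planar data embedded in six dimensions (an assumption for
# illustration): every sample lies in a two-dimensional subspace.
rng = np.random.default_rng(1)
Z = rng.normal(size=(15, 2))                     # coordinates within the plane
B = rng.normal(size=(2, 6))                      # embedding into six dimensions
X_planar = Z @ B

psi_2d = sample_predictivity(X_planar, r=2)      # samples lie in the fitted plane
psi_1d = sample_predictivity(X_planar, r=1)      # one dimension loses information
```

Because the synthetic samples lie exactly in a plane, the two-dimensional predictivities are all unity up to rounding error, while the one-dimensional values fall between zero and one.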
Returning to our earlier example, both the axis predictivities and sample predictivities
of the artificial two-dimensional data set in six dimensions all equal unity. Turning to the
complete set of aircraft data, a PCA achieved an overall two-dimensional quality measure
of 0.9677. Table 3.6 gives the two-dimensional adequacies and predictivities. Variable
SPR is the most adequately approximated and PLF the least. In this case, the
predictivities show a similar ranking. However, as we saw above, the predictivities give
an immediate estimate of the success in predicting the values taken by X for its associated
variables, while adequacy is more concerned with the dispositions of coordinate axes and
any related distortions of the scale of each axis.
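The contrast between the two measures can be sketched in code. The definition of adequacy used here is an assumption, since it is not restated in this passage: adequacy is taken to be the squared length of the projection of each unit coordinate axis onto the fitted subspace, that is, the $i$th diagonal element of $V_r V_r'$, where $V_r$ holds the first $r$ columns of $V$. That definition is at least consistent with the adequacies in Table 3.6 summing to 2 for the two-dimensional biplot. The function name and the synthetic data are hypothetical.

```python
import numpy as np

def adequacy_and_predictivity(X, r):
    """Adequacy (assumed to be diag(Vr Vr')) and axis predictivity for an
    r-dimensional PCA (illustrative sketch only)."""
    Xc = X - X.mean(axis=0)                      # column-centre the data
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Vr = Vt[:r].T                                # p x r matrix of leading axes
    adequacy = np.sum(Vr**2, axis=1)             # depends on axis directions only
    fitted = np.sum((Vr * s[:r])**2, axis=1)     # variance reproduced per variable
    total = np.sum(Xc**2, axis=0)                # total variance per variable
    return adequacy, fitted / total

# Synthetic data with unequal column scales (an assumption for illustration).
rng = np.random.default_rng(2)
X = rng.normal(size=(30, 4)) * np.array([10.0, 3.0, 1.0, 5.0])

adequacy, predictivity = adequacy_and_predictivity(X, r=2)
```

In this sketch adequacy uses only the directions of the fitted axes, whereas predictivity also uses the singular values, which is one way to see why the two measures need not agree.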
The axis predictivities are summarized in Figure 3.22, which plots the predictivities
for each variable against the number of dimensions fitted. We see that SPR is nearly
perfectly predicted in one dimension and that all variables except PLF require only
Table 3.6 Adequacies and axis predictivities for the two-dimensional
PCA biplot of the aircraft data in Figure 2.5.

                SPR      RGF      PLF      SLF
Adequacy        0.999    0.205    0.002    0.794
Predictivity    1.000    0.571    0.245    0.957