The measures (3.26) and (3.27) will coincide only when the regression fit is exact and
the final residual in (3.25) vanishes. Normally (3.26), unlike (3.27), will not attain unity
even when all p dimensions of the PCA fit are used. In the expressions for (3.26) and
(3.27), X may be replaced by its SVD $X = U\Sigma V'$, in which case the least-squares coefficients of (3.25) become $b = \Sigma^{-1}U'x$ and
$$
b'JV'X'XVJb = b'JV'V\Sigma^{2}V'VJb = x'U\Sigma^{-1}J\Sigma^{2}J\Sigma^{-1}U'x = x'UJU'x,
$$
giving (3.26) in the form
$$
\frac{x'UJU'x}{x'x}. \qquad (3.28)
$$
Furthermore,
$$
b'V'X'XVb = b'V'V\Sigma^{2}V'Vb = x'U\Sigma^{-1}\Sigma^{2}\Sigma^{-1}U'x = x'UU'x,
$$
so that (3.27) takes the form
$$
\frac{x'UJU'x}{x'UU'x}. \qquad (3.29)
$$
The above expressions generalize in a straightforward way to the addition of several new variables.
It is clear from formulae (2.20), (3.28) and (3.29) that in order to add new axes with
their associated predictivities all we need, in addition to the original SVD, are the values
of all the samples on each of the new variables. There is no need to perform the actual
regression.
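To make this concrete, the following is a minimal R sketch, not taken from the book's accompanying code: it computes the measures (3.28) and (3.29) for a new variable x using only the SVD of the centred and scaled data matrix, and then checks (3.28) against an explicit least-squares regression of x on the first r principal component scores. The function name new.var.predictivity and the simulated data are illustrative assumptions; only base R functions (svd, lm) are used.

# Hedged sketch: axis predictivity of a new variable from the SVD only.
new.var.predictivity <- function(X, x, r) {
  X  <- scale(X)                    # centred and scaled data matrix
  xc <- x - mean(x)                 # centred new variable
  U  <- svd(X)$u
  Ux <- as.vector(t(U) %*% xc)      # U'x
  num <- sum(Ux[1:r]^2)             # x'UJU'x, with J retaining r dimensions
  c(eq.3.28 = num / sum(xc^2),      # denominator x'x, as in (3.28)
    eq.3.29 = num / sum(Ux^2))      # denominator x'UU'x, as in (3.29)
}

# Check against an explicit regression on the first r PC scores:
set.seed(1)
X  <- matrix(rnorm(100 * 5), 100, 5)
x  <- X[, 1] + X[, 2] + rnorm(100)
r  <- 2
Z  <- scale(X) %*% svd(scale(X))$v[, 1:r]  # sample scores on first r PCs
xc <- x - mean(x)
fit <- lm(xc ~ Z - 1)                      # the regression done explicitly
sum(fitted(fit)^2) / sum(xc^2)             # agrees with eq.3.28
new.var.predictivity(X, x, r)              # no regression required

The check simply confirms the algebra above: the fitted sum of squares from the explicit regression coincides with x'UJU'x obtained from U and x alone.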
As an example of adding new variables we again consider the Ocotea data. Knowledge of the ratios VesL to VesD and RayH to RayW is of practical importance. These two ratios (VLDratio and RHWratio) have been added in the form of calibrated biplot axes to the PCA biplot of Figure 3.23. The augmented biplot is given in Figure 3.24.
The function call for obtaining the biplot in the bottom panel of Figure 3.24 is
> VLDratio <- Ocotea.data[,4]/Ocotea.data[,3]
> RHWratio <- Ocotea.data[,6]/Ocotea.data[,7]
> Ocotea.data.newvars <- data.frame(Ocotea.data, VLDratio =
VLDratio, RHWratio = RHWratio)
> PCAbipl(Ocotea.data[,3:8], scaled.mat = TRUE,
X.new.vars = as.matrix(Ocotea.data.newvars[,9:10]),
colours = "green", pch.samples = 15, pch.samples.size = 1.25,
label = FALSE, pos = "Hor", offset = c(-0.2, 0.1, 0.1, 0.2),
n.int = c(5,5,5,5,3,5,10,5),
ax.col = list(ax.col = c(rep("grey",6),"red","red"),
tickmarker.col = c(rep("grey",6),"red","red"),
marker.col = c(rep("grey",6),"red","red")),
ax.name.col = c(rep("black",6), "red","red"),
pch.new = 16, pch.new.cols = c("red","blue","cyan"),
pch.new.labels = c("O.bul","O.ken","O.por"),
predictions.sample = c(10,35))
It follows from Table 3.16 that neither of the two added variables has high axis predictivity in two dimensions. Although this is particularly true of VLDratio, its axis predictivity increases dramatically when a third dimension is added.
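As a hedged illustration (again not the book's code) of how such an increase can be examined directly from the SVD, the predictivity of VLDratio can be evaluated for one, two and three fitted dimensions. This assumes Ocotea.data and VLDratio are defined as in the call above, that the scaling matches scaled.mat = TRUE, and that measure (3.28) is the one tabulated; if (3.29) is intended, the denominator x'UU'x would be used instead.

X  <- scale(Ocotea.data[, 3:8])    # matches scaled.mat = TRUE above
U  <- svd(X)$u
xc <- VLDratio - mean(VLDratio)    # centred new variable
sapply(1:3, function(r)            # measure (3.28) for r = 1, 2, 3
  sum((t(U[, 1:r, drop = FALSE]) %*% xc)^2) / sum(xc^2))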