3.8 Some novel applications and enhancements of PCA biplots
3.8.1 Risk management example revisited
In the biplot in Figure 3.1 the point (CM = 0, IRD = 0, MM = 0, ALCO = 0, SE = 0,
EDSA = 0, EDM = 0) is interpolated by adding the argument
X.new.samples = matrix(rep(0,7), nrow = 1) in the call to PCAbipl. This results in the
top panel of Figure 3.29. If the interpolated point is wanted for the scaled data then the
argument scaled.mat = TRUE is also needed. The reader can verify that the result is then
a biplot in which the interpolated point does not appear, because it lies outside the
original plotting area. To make the point visible, increase the default setting of
exp.factor. The bottom panel of Figure 3.29 was obtained with the settings
scaled.mat = TRUE, exp.factor = 2. Note the annotation using the functions draw.arrow
and draw.text. The interpolated point can be viewed as an ideal point where there is
zero loss for all seven instruments. Clearly the 'best' situation using the unscaled data
was achieved on day20.
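The interpolation step can be illustrated in base R without the book's software. The following is only a minimal sketch under stated assumptions: the 20 x 7 matrix X is simulated and is not the risk management data, and the names col.means, V, Z, Z.new and inside are introduced here for illustration. It assumes that a new sample is placed in the biplot by applying the same centring as for the original data and then projecting onto the first two principal axes.

```r
# Simulated stand-in for the 20 x 7 data matrix (NOT the book's data)
set.seed(1)
X <- matrix(rnorm(20 * 7, mean = 5), nrow = 20,
            dimnames = list(paste0("day", 1:20),
                            c("CM", "IRD", "MM", "ALCO", "SE", "EDSA", "EDM")))

# Centre the columns (unscaled case)
col.means <- colMeans(X)
X.cent <- sweep(X, 2, col.means)

# First two principal axes from the SVD of the centred matrix
V <- svd(X.cent)$v[, 1:2]

# Biplot coordinates of the original samples
Z <- X.cent %*% V

# Interpolate the all-zero sample: same centring, then project
X.new <- matrix(rep(0, 7), nrow = 1)
Z.new <- sweep(X.new, 2, col.means) %*% V

# Does the new point fall inside the range spanned by the samples?
# If not, the plotting area has to be enlarged to show it.
inside <- all(Z.new >= apply(Z, 2, min) & Z.new <= apply(Z, 2, max))
```

For the scaled case one would, under the same assumptions, also divide each centred column (and the centred new sample) by the column standard deviation before projecting.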
One problem, especially with the biplot in the bottom panel of Figure 3.29, is that
several data points are bundled so closely together that it is impossible to identify a
particular day. With even moderately large data sets this problem is bound to get worse.
We therefore provide an R function, PCAbipl.zoom, to zoom interactively into any
required part of a biplot. When this function is called with the argument zoomval = x,
the window with the drawn PCA biplot is activated, the mouse pointer changes to a
cross and the user can move the cross to select the bottom left-hand corner of the region
to zoom into. The value x controls the amount of zooming such that the aspect ratio is
kept constant at unity. Figure 3.30 gives an example of the zooming function.
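The idea of zooming from a chosen bottom left-hand corner while keeping the aspect ratio at unity can be sketched in base R graphics. This is not the implementation of PCAbipl.zoom: the helper zoom.limits and its argument side are hypothetical, the coordinates Z are simulated, and the corner is hard-coded here rather than picked with the mouse.

```r
# Hypothetical helper (not part of the book's software): axis limits of a
# square window whose bottom left-hand corner is `corner`
zoom.limits <- function(corner, side) {
  list(xlim = c(corner[1], corner[1] + side),
       ylim = c(corner[2], corner[2] + side))
}

set.seed(3)
Z <- matrix(rnorm(40), ncol = 2)   # stand-in for 20 biplot sample points
plot(Z, asp = 1, pch = 16, xlab = "PC 1", ylab = "PC 2")

# Interactively the corner could be picked with: corner <- unlist(locator(1))
corner <- c(-0.5, -0.5)
lims <- zoom.limits(corner, side = 1)

# Redraw only the selected square region, aspect ratio still 1
plot(Z, asp = 1, pch = 16, xlim = lims$xlim, ylim = lims$ylim,
     xlab = "PC 1", ylab = "PC 2")
text(Z, labels = paste0("day", 1:20), pos = 3, cex = 0.7)
```

Using equal widths for xlim and ylim together with asp = 1 keeps distances in the zoomed view comparable in both directions.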
The Figure 3.29 biplot can be further enhanced by adding a trend line showing
the seven-dimensional movement over time. A simple solution would be to connect the
sample points in the PCA biplot in temporal order. Since PCAbipl returns the coordinates
of the sample points in the biplot, this is easy to accomplish. The connecting lines are
shown in Figure 3.31 for both the unscaled and the scaled data, but it is obvious that
the trend is difficult to follow, with too many interconnecting lines. Had the data set
consisted of 100 days' VAR values, such a connecting line would render the biplot
useless. It is easy to correct this: we need some form of smoothing. In Figure 3.32 a
nonparametric regression smoother was fitted to each of the two dimensions separately.
The R commands
> z1 <- X.cent %*% Eigenvectors[,1]   # Eigenvectors returned by PCAbipl
> z2 <- X.cent %*% Eigenvectors[,2]
> zfit1 <- fitted(loess(z1 ~ I(1:20)))
> zfit2 <- fitted(loess(z2 ~ I(1:20)))
yield the values to be connected to form this trend line. In this example, the default span
for the loess function was used. However, the amount of smoothing can be controlled
by this parameter and any other smoothing technique can be applied similarly.
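To show how the span affects the trend line, here is a self-contained sketch using base R only. The data are simulated with an artificial drift over time (they are not the VAR data of the example), and the eigenvectors are taken from svd rather than from PCAbipl; the names n.days, day, zfit1.rough and zfit2.rough are introduced here for illustration.

```r
set.seed(2)
n.days <- 20

# Simulated 20 x 7 matrix with a drift over time (NOT the book's data)
X <- matrix(rnorm(n.days * 7), nrow = n.days) + outer(1:n.days, rep(0.2, 7))
X.cent <- scale(X, center = TRUE, scale = FALSE)
Eigenvectors <- svd(X.cent)$v      # stand-in for the PCAbipl output

day <- 1:n.days
z1 <- as.numeric(X.cent %*% Eigenvectors[, 1])
z2 <- as.numeric(X.cent %*% Eigenvectors[, 2])

# Smooth each biplot dimension separately against time
zfit1 <- fitted(loess(z1 ~ day))                    # default span = 0.75
zfit2 <- fitted(loess(z2 ~ day))
zfit1.rough <- fitted(loess(z1 ~ day, span = 0.5))  # less smoothing
zfit2.rough <- fitted(loess(z2 ~ day, span = 0.5))

plot(z1, z2, asp = 1, pch = 16, xlab = "PC 1", ylab = "PC 2")
lines(z1, z2, col = "grey")                # samples joined in temporal order
lines(zfit1, zfit2, lwd = 2)               # smoothed trend, default span
lines(zfit1.rough, zfit2.rough, lty = 2)   # smoothed trend, span = 0.5
```

The grey path corresponds to simply connecting the points in temporal order, while the two smoothed curves show how a smaller span follows the day-to-day movement more closely.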