The biplot in the bottom panel of Figure 5.33 results from the following function
call:
PCAbipl(X = means.mat.scaled, G = indmat(1:3),
pch.samples = rep(15,3), pch.samples.size = 1.25,
pch.new.labels.size = 0.6, pch.new.size = 1,
colours = UBcolours[(1:3)+12], X.new.samples = data,
pch.new = 16, pch.new.labels = 1:37,
pch.new.cols = rep(UBcolours[1:3],c(20,7,15)), exp.factor = 2,
markers = FALSE, pos = "Hor", offset = c(0.02, 0.3, 0.1, 0.1),
offset.m = rep(-0.1, 6), n.int = rep(2,6))
First we compare this representation with that given in Figure 4.1. The top panel
is concerned with unnormalized data which, as we have seen, can have a profound
effect on PCA, as is verified by comparing with the bottom panel showing an
unstructured but normalized PCA. The normalization may be regarded as a first
step towards removing the incommensurabilities that are fully handled by CVA
itself (Figure 4.2). In Figure 5.33 (bottom panel) the three means are represented
exactly in two dimensions, as in CVA, roughly at the vertices of an equilateral
triangle and, as might be expected, with a different orientation from that in Figure 4.1
(bottom panel). The between-group dispersion is much clearer in Figure 5.33 and
has much less overlap than in Figure 4.1 (bottom panel). The grouping given by
CVA in Figure 4.2 is even clearer but looks remarkably similar to Figure 5.33. Apart
from Numves, the corresponding biplot axes in Figure 4.2 and the bottom panel of
Figure 5.33 are almost identical. The PCA of the group means with added within-group
dispersion has worked very well and might be considered a model for further
development.
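The idea behind the bottom panel of Figure 5.33, a PCA of the group means with the individual samples interpolated afterwards as supplementary points, can be sketched in R as follows. The random data, the number of variables and the group sizes here are purely illustrative, not the book's data:

```r
# PCA of K = 3 group means; all n samples then interpolated into the
# means' principal component space. An illustrative sketch of the idea
# behind the Figure 5.33 biplot, using random data.
set.seed(3)
X <- matrix(rnorm(42 * 5), 42, 5)            # n = 42 samples, 5 variables
g <- rep(1:3, times = c(20, 7, 15))          # three groups
means <- rowsum(X, g) / as.vector(table(g))  # 3 x 5 matrix of group means
pca <- prcomp(means)                         # PCA of the means only
means.2d <- pca$x[, 1:2]                     # 3 centred means lie exactly in 2D
samples.2d <- scale(X, center = pca$center, scale = FALSE) %*%
  pca$rotation[, 1:2]                        # samples as supplementary points
```

Because three centred points span at most two dimensions, the means are represented exactly, while the within-group dispersion is carried by the interpolated samples.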
We could proceed as in the above PCA example by doing a PCO of D. Then we
would evaluate the group means to produce a map of the K group means by using
PCA. Finally, we would rotate all n samples so that the group means occupy the first
K − 1 dimensions and show the within-group samples in this space. A problem with
this approach is that n may be very large, entailing a massive eigendecomposition. This
problem can be avoided by using the methodology described below, which requires the
whole of D but the eigenstructure of only a K × K matrix.
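The PCO step just described can be sketched from first principles (cmdscale in R's stats package does the same job; the function below is purely illustrative, not the book's own routine):

```r
# Principal coordinate analysis (PCO) of a matrix of distances d_ij:
# a minimal illustrative sketch.
pco <- function(dmat) {
  n <- nrow(dmat)
  D <- -0.5 * dmat^2                    # ddistance matrix {-1/2 d_ij^2}
  J <- diag(n) - matrix(1 / n, n, n)    # centring matrix I - 11'/n
  B <- J %*% D %*% J                    # double-centred inner products
  e <- eigen(B, symmetric = TRUE)
  keep <- e$values > 1e-9               # retain positive eigenvalues only
  e$vectors[, keep, drop = FALSE] %*%
    diag(sqrt(e$values[keep]), nrow = sum(keep))
}

# Euclidean distances are recovered exactly from the PCO coordinates.
set.seed(1)
X <- matrix(rnorm(20 * 4), 20, 4)
Y <- pco(as.matrix(dist(X)))
max(abs(as.matrix(dist(Y)) - as.matrix(dist(X))))  # effectively zero
```

Note that eigen() here operates on the full n × n matrix; this is exactly the cost that the AoD methodology avoids.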
The method followed in constructing the biplots in Figure 5.33 may be generalized
to an analysis of distance (AoD) where the ddistances between all pairs of samples are
available in the form of an n × n matrix D = {−½ d²_ij}. Distances may be defined very
generally, though it is desirable that they be Euclidean embeddable, as we assume here.
In addition, as above, we have grouping information available in G (see Section 4.2)
with associated partitioning of D conveniently written as

    D = [ D_11  D_12  ...  D_1K ]
        [ D_21  D_22  ...  D_2K ]
        [  ...   ...  ...  ...  ]
        [ D_K1  D_K2  ...  D_KK ],
though there is no requirement in the following that the n samples be presented in the
implied group-by-group order.
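As a concrete illustration of how the whole of D together with the grouping information can yield a small K × K problem, the ddistances between the group centroids can be assembled from the block means of D alone. This follows from the centroid property of Euclidean-embeddable ddistances; the function and data below are an illustrative sketch, not the book's code:

```r
# From the full n x n ddistance matrix D = {-1/2 d_ij^2} and the group
# indicator matrix G (n x K), obtain the K x K ddistance matrix between
# the group centroids, using only block means of D.
group_mean_ddist <- function(D, G) {
  H <- G %*% diag(1 / colSums(G))  # columns average within each group
  M <- t(H) %*% D %*% H            # block means (1' D_kl 1) / (n_k n_l)
  K <- nrow(M)
  m <- diag(M)
  # ddistance between centroids k and l: M_kl - M_kk/2 - M_ll/2
  M - 0.5 * matrix(m, K, K) - 0.5 * matrix(m, K, K, byrow = TRUE)
}

# Check against ddistances computed directly from the group centroids.
set.seed(1)
X <- matrix(rnorm(30 * 4), 30, 4)
g <- rep(1:3, each = 10)
G <- model.matrix(~ factor(g) - 1)           # indicator matrix (Section 4.2)
means <- rowsum(X, g) / as.vector(table(g))
Dbar.direct <- -0.5 * as.matrix(dist(means))^2
D <- -0.5 * as.matrix(dist(X))^2
max(abs(group_mean_ddist(D, G) - Dbar.direct))  # effectively zero
```

Only the K × K result needs an eigendecomposition, however large n may be.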