Monoplots - Understanding Biplots

Information Technology Reference

In-Depth Information

approximate the observed values d ii presented in a symmetric proximity matrix D .

Crucially, there is no requirement that the entries of D be derived from a data matrix X ;

they may be directly observed in some experimental situation. Then, a monoplot of Z is

the only possibility. We denote the distances generated by the rows of Z by .

Different methods of multidimensional scaling depend on the criterion specified for

measuring the discrepancy between observed, D , and fitted distances, , and the numeri-

cal algorithms used for optimizing the chosen criterion. These criteria may be divided into

two classes - metric multidimensional scaling and nonmetric multidimensional scaling.

The simplest metric multidimensional scaling method is classical scaling/principal coor-

dinates analysis (PCO), discussed in detail in Chapter 5. PCO requires that there exists

amatrix Y in (at most) n

1 dimensions whose rows generate the observed distances

exactly (the Euclidean embeddable property); in PCA, Y coincides with X . Then, Z of

rank r is obtained by minimizing || Y − Z ||

−

2 which, as we have seen, is found from the

Eckart-Young theorem on the least-squares properties of the SVD of Y . As explained in

Chapter 5, the two steps of finding Y and then Z may be subsumed into a single simple

eigenvalue decomposition. In this method, plays no overt part.

When D is directly observed, the distances are unlikely to be generated by a real

matrix Y . Then, strictly speaking, PCO is unavailable, though it may give acceptable

results when there are only a few unimportant negative eigenvalues in the eigendecom-

position. Other forms of metric scaling do not require the existence of Y but do depend

overtly on . The two most popular are least-squares scaling which minimizes a criterion

termed stress ,definedby

i ( d ii − δ ii )

;

and least-squares squared scaling which minimizes a criterion sstress ,definedby

i ( d ii − δ

ii )

These criteria have to be minimized by available iterative methods, such as ALSCAL,

KYST,orSMACOF,

(see Cox and Cox, 2001; Borg and Groenen, 2005).

Figure 10.1 gives a geometric interpretation of the stress criterion. We plot the points

( d ii ,

...

for all ( ii )

δ ii )

pairs. With a perfect fit, these points would lie on the line d

= δ

In other cases, the projection of ( d ii , δ ii ) onto the line d = δ is

( d ii + δ ii , d ii + δ ii ) ,

giving a residual sum of squares equal to half the stress criterion. Note that stress =

2 . A similar representation where all distances are replaced by squared dis-

tances gives the sstress criterion. The representation of Figure 10.1 is not particularly

interesting in itself, but it contributes to a better understanding of the corresponding

nonmetric criterion, to be described next.

Nonmetric multidimensional scaling depends only on the ordinal information in D .

Then, it is required only that the fitted values

|| D − ||

approximate the same ordering as those

in D . How this is done is indicated in Figure 10.2, which is similar to Figure 10.1 except

that the linear 'regression' is replaced by a monotone regression function, as shown.

This monotone regression function is obtained by partitioning the points into blocks that

are above and to the right of all preceding blocks. The fitted values ˆ

δ ii

are determined

Understanding Biplots

Search WWH ::

Custom Search

Home