However, when $D$ has been derived from a data matrix $X$, we may wish to superimpose
axes, possibly calibrated, to upgrade these monoplots to biplots. These axes approximate
the variables, whose values are assumed to be given in the columns of $X$. Chapter 5
has shown how nonlinear trajectories may represent variables in PCO. The regression
method (see Sections 2.7, 3.5, 4.5, 5.2 and 7.3) may be used to superimpose linear axes
on PCO plots and on plots derived from other methods of multidimensional scaling.
10.2 Monoplots related to the covariance matrix
10.2.1 Covariance plots
Falling into the class of monoplots where one of the entities is omitted by choice is the
frequent presentation of maps of covariance and correlation matrices. These are closely
related to PCA but now with interest focused on the approximation of $X'X$ rather than of
$X$ itself. With the SVD $X = U \Sigma V'$, in Chapter 3, PCA was based on plotting the rows
of $U \Sigma J$ to represent the samples and the rows of $VJ$ to give directions for the variables.
In monoplots we plot the rows of $V \Sigma J$. The inner product
$(V \Sigma J)(V \Sigma J)' = V \Sigma^2 J V'$
approximates $X'X$ or, equivalently, the covariance matrix $(n-1)^{-1} X'X$. Now the inner
product is found from pairs of points of the same kind, both representing the variables.
The inner product of $UJ$ and $V \Sigma J$ is unaltered, so the approximation of $X$ would
remain valid and easily evaluated if $UJ$ were added to the monoplot and the rows of
$V \Sigma J$ suitably calibrated, so giving a biplot. In this representation the distances between
pairs of row points do not approximate the distances between the rows of $X$, as was
available with the PCA representations of Chapter 3. As a compromise, one may plot
$U \Sigma J$ for the row points and $V \Sigma J$ (see CA: Chapter 7) for the variables, so producing a
joint plot with no interesting inner product. It would be possible, but impracticable, to
add calibrated axes based on $UJ$ and $VJ$ as well.
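The relations in this subsection can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes NumPy, uses simulated data in place of a real data matrix, takes $J$ to be the diagonal selector of the first r dimensions, and the names `G`, `A` and `r` are chosen here for illustration only.

```python
import numpy as np

# Hypothetical data: 50 samples on 4 variables with very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) * np.array([50.0, 0.2, 12.0, 1.0])
X = X - X.mean(axis=0)                 # centre the columns

# SVD X = U Sigma V'; s holds the diagonal of Sigma in decreasing order.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt.T
r = 2                                  # assumed number of retained dimensions

# Nonzero columns of V Sigma J: one plotted point per variable.
G = V[:, :r] * s[:r]

# Inner products between variable points: V Sigma^2 J V', approximating X'X.
approx_XtX = G @ G.T
approx_cov = approx_XtX / (X.shape[0] - 1)

# With every dimension retained the inner products reproduce X'X exactly.
G_full = V * s
assert np.allclose(G_full @ G_full.T, X.T @ X)

# Pairing UJ with V Sigma J gives the same rank-r approximation of X
# as the PCA pairing of U Sigma J with VJ.
A = U[:, :r]
X_hat = A @ G.T
X_hat_pca = (U[:, :r] * s[:r]) @ V[:, :r].T
assert np.allclose(X_hat, X_hat_pca)
```

In this sketch the squared length of each row of `G` approximates (n - 1) times the variance of the corresponding variable, so variables with large variances lie far from the origin in the monoplot.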
As an example we consider Flotation.data. This data set forms part of a larger
data set collected at a base metal flotation plant. The complete data set consists of
three sets of variables: digitized froth variables, operational variables and process quality
variables. Six of the operational variables were chosen for Flotation.data, namely
RMOF, RMP, BMSWA, BMCFF, BMCFD and EXFFBB. It is clear from Table 10.1
that the standard deviations of these variables differ greatly. Furthermore,
they can be considered to be measured on a ratio scale. In Figure 10.3 we give, for
comparative purposes, the ordinary biplot of the flotation data, centred but unscaled in
the top panel and scaled to unit variances in the bottom panel. Figures 10.4 and 10.5
Table 10.1  Covariance matrix of the flotation data.

               RMOF        RMP      BMSWA      BMCFF      BMCFD     EXFFBB
RMOF      2695.5160     7.3482   359.0530  1987.4986     0.2112    48.4237
RMP          7.3482     0.0352     1.1955     6.9332     0.0027     0.1348
BMSWA      359.0530     1.1955   157.0530   400.2840     0.2119     8.2891
BMCFF     1987.4986     6.9332   400.2840  1949.6857     0.4801    39.0583
BMCFD        0.2112     0.0027     0.2119     0.4801     0.0029     0.0018
EXFFBB      48.4237     0.1348     8.2891    39.0583     0.0018     1.0498
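
The spread of the standard deviations can be read off the diagonal of Table 10.1. The sketch below is an illustration rather than part of the original analysis: it types in the covariance values as printed in the table, takes square roots of the diagonal, and also forms the correlation matrix, which is the covariance matrix of the variables after scaling to unit variances. The names `S`, `sd` and `R` are chosen here for illustration.

```python
import math

names = ["RMOF", "RMP", "BMSWA", "BMCFF", "BMCFD", "EXFFBB"]

# Covariance matrix, entered as printed in Table 10.1.
S = [
    [2695.5160, 7.3482, 359.0530, 1987.4986, 0.2112, 48.4237],
    [7.3482, 0.0352, 1.1955, 6.9332, 0.0027, 0.1348],
    [359.0530, 1.1955, 157.0530, 400.2840, 0.2119, 8.2891],
    [1987.4986, 6.9332, 400.2840, 1949.6857, 0.4801, 39.0583],
    [0.2112, 0.0027, 0.2119, 0.4801, 0.0029, 0.0018],
    [48.4237, 0.1348, 8.2891, 39.0583, 0.0018, 1.0498],
]
p = len(names)

# Standard deviations are the square roots of the diagonal entries.
sd = [math.sqrt(S[i][i]) for i in range(p)]

# Correlations: covariances divided by the product of the standard deviations.
R = [[S[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]

for name, value in zip(names, sd):
    print(f"{name:7s} sd = {value:9.4f}")
print(f"largest / smallest sd = {max(sd) / min(sd):.0f}")
```

With these values the largest standard deviation (RMOF) is nearly a thousand times the smallest (BMCFD), which is why the unscaled and unit-variance displays can look so different.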