Figure 1.4 Construction of an asymmetric biplot. (Diagram: X generates D; D is approximated by a matrix of Pythagorean distances, which Y generates; a curved arrow indicates that X is approximated by Y.)
of the methodology is approximating the distance matrix D by an n × n matrix of Pythagorean
distances. Operationally, this is achieved iteratively by updating the r-dimensional
coordinates Y, which generate these Pythagorean distances, to improve the approximation to D. It is hoped that
a small value of r (ideally 2) will give a good approximation. Finally, the curved
arrow represents two ideas: (i) in principal component analysis (PCA), Y approximates X;
and (ii) more generally, information on X can be represented in the map of Y (the
essence of biplots). These are the basic steps of multidimensional scaling (see Cox and
Cox, 2001).
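As a minimal sketch of the iterative step described above, the following Python code (assuming NumPy is available) updates r-dimensional coordinates Y by plain gradient descent so that their Pythagorean distances move closer to D. The function names, the raw-stress criterion, the step size and the random starting configuration are all illustrative assumptions rather than the algorithm of any particular source; practical multidimensional scaling software uses more refined updates.

```python
import numpy as np

def pythagorean_distances(Y):
    """Return the n x n matrix of Euclidean distances between the rows of Y."""
    diff = Y[:, None, :] - Y[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def stress(D, Y):
    """Raw stress: sum over pairs i < j of (fitted distance - D[i, j])^2."""
    return ((pythagorean_distances(Y) - D) ** 2).sum() / 2.0

def fit_coordinates(D, r=2, steps=500, lr=0.01, seed=0):
    """Iteratively update coordinates Y (n x r) to reduce the raw stress.

    Hypothetical helper: random start, fixed step size, no convergence check.
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    Y = rng.normal(size=(n, r))
    for _ in range(steps):
        diff = Y[:, None, :] - Y[None, :, :]        # diff[i, j] = Y[i] - Y[j]
        fitted = np.sqrt((diff ** 2).sum(axis=2))   # distances generated by Y
        np.fill_diagonal(fitted, 1.0)               # avoid 0/0 on the diagonal
        weights = (fitted - D) / fitted
        np.fill_diagonal(weights, 0.0)
        grad = 2.0 * (weights[:, :, None] * diff).sum(axis=1)
        Y = Y - lr * grad                           # gradient descent step
    return Y
```

Calling fit_coordinates(D, r=2) returns an n × 2 array whose rows can be plotted as the map of the samples; comparing stress(D, Y) before and after fitting shows how much the approximation has improved.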
In general, the points given by Y generate Pythagorean distances that approximate the values
in D. In addition, and this is the special contribution of biplots, approximations to the true
values X may be deduced from Y. In the simplest case, the PCA biplot, this approximation
is made by projecting the orthogonal axes of X onto the subspace occupied by Y. In the
subsequent chapters, we will discuss more general forms of asymmetric biplots. The most
general of these, appropriately named the generalized biplot, has as a special case the PCA
biplot when all variables in X are continuous and the matrix D consists of Pythagorean
distances. When the variables in X are restricted to be continuous, the rows of X
represent the samples as points in p-dimensional space with an associated coordinate
system. In the biplot, we represent the samples as points whose coordinates are given
by the rows of Y, and the coordinate system of X by appropriately defined biplot axes.
These axes become nonlinear biplot trajectories when the definition of distance in the
matrix D necessitates a nonlinear transformation from X to Y. The methodology outlined
by Figure 1.4 also allows us to include categorical variables. Even though a categorical
variable cannot be represented in the space of X by a linear coordinate axis, we can
calculate the matrix D and proceed from there.
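The PCA biplot case mentioned above can be sketched as follows, assuming NumPy and continuous variables only. The function name pca_biplot and its return values are hypothetical. The sketch takes Y as the first r principal component scores of the centred data and obtains the direction of each biplot axis by projecting the corresponding coordinate axis of X onto the subspace occupied by Y.

```python
import numpy as np

def pca_biplot(X, r=2):
    """Return (Y, Vr, X_hat) for a PCA biplot sketch.

    Y     : n x r sample coordinates (principal component scores)
    Vr    : p x r matrix; row k is the direction of the biplot axis for
            variable k, i.e. the k-th coordinate axis of X projected
            onto the r-dimensional subspace
    X_hat : n x p approximation to X deduced from Y
    """
    means = X.mean(axis=0)
    Xc = X - means                               # centre each variable
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Vr = Vt[:r].T                                # first r right singular vectors
    Y = Xc @ Vr                                  # coordinates in the subspace
    X_hat = Y @ Vr.T + means                     # values read off the biplot axes
    return Y, Vr, X_hat
```

When the centred X has rank r, X_hat reproduces X exactly; otherwise it is the least-squares rank-r approximation, and the Pythagorean distances between the rows of Y approximate those between the rows of X.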
Thus, a biplot adds to Y information on the variables given in X. In multidimensional
scaling, D may be observed directly rather than derived from X, and then biplots cannot
be constructed. The different types of asymmetric biplots discussed above depend on
the properties of the variables in the matrix X and the distance metric producing the
matrix D. Many special cases of importance fall within this general framework and are
illustrated by applications in the following chapters. Several definitions of distance can be used
in constructing D, covering quantitative variables, qualitative variables, or mixtures
of the two. For symmetric biplots, the position is simpler as we have only two main
possibilities: (i) a quantitative variable classified in a two-way table and (ii) a two-way
table of counts.
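To illustrate how a matrix D might be calculated when X mixes quantitative and qualitative variables, here is a minimal Python sketch. The particular distance is an assumption made purely for illustration: squared differences for the quantitative variables plus a count of mismatches for the qualitative ones, followed by a square root. It is only one of many possible definitions, and the function name mixed_distance_matrix is hypothetical.

```python
import math

def mixed_distance_matrix(numeric, categorical):
    """Return an n x n distance matrix D as a list of lists.

    numeric     : list of n rows of quantitative values
    categorical : list of n rows of category labels
    Illustrative definition: sqrt(sum of squared numeric differences
    + number of mismatching categories).
    """
    n = len(numeric)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            squared = sum((a - b) ** 2 for a, b in zip(numeric[i], numeric[j]))
            mismatches = sum(a != b for a, b in zip(categorical[i], categorical[j]))
            D[i][j] = D[j][i] = math.sqrt(squared + mismatches)
    return D
```

Once D is available in this way, the remaining steps proceed from D alone, as in the case of continuous variables.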
In Figure 1.5 the biplots to be discussed in the designated chapters are represented
diagrammatically. The distances associated with the matrix D in Figure 1.4 are divided into
subsets for the different types of biplots. The approximating distance matrix generated by Y
always consists of Pythagorean
distances to allow intuitive interpretation of the rows of Y.