Figure . . [This figure also appears in the color insert.] PCA layout of digits dataset (top panel) and the -D graph layout (bottom panel)
of all the variables. These are two instances where the resulting graph representation
of the data gives rise to a bipartite graph. A slight modification of the $Q(\cdot)$ objective
function leads to interesting graph layouts of such data sets. Let $X = [Z' \; Y']'$, where
$Z$ contains the coordinates of the first subset of the vertices and $Y$ those of the second
subset. The objective function for squared Euclidean distances can then be written as
(given the special block structure of the adjacency matrix $A$)

$$Q(Z, Y \mid A) = \operatorname{trace}\left(Z' D_Z Z + Y' D_Y Y - 2\, Y' A' Z\right)$$
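As a quick numerical check of the trace form, the following minimal sketch compares it with the edge-by-edge sum of weighted squared Euclidean distances between the two vertex subsets. The small table `A`, the coordinates `Z` and `Y`, and the function names `q_edges` and `q_trace` are illustrative assumptions, not taken from the text; $D_Z$ and $D_Y$ are taken to be the diagonal matrices of the row sums and column sums of $A$.

```python
# Hypothetical toy data: a 2 x 3 contingency table used as the
# off-diagonal block A of a bipartite adjacency matrix.
A = [[2, 0, 1],
     [1, 3, 0]]

# Illustrative one-dimensional coordinates (one row per vertex):
# Z for the row vertices, Y for the column vertices.
Z = [[0.5], [-1.0]]
Y = [[1.0], [0.0], [-2.0]]


def q_edges(A, Z, Y):
    """Sum over all (i, j) of a_ij * ||z_i - y_j||^2."""
    total = 0.0
    for i, z in enumerate(Z):
        for j, y in enumerate(Y):
            total += A[i][j] * sum((a - b) ** 2 for a, b in zip(z, y))
    return total


def q_trace(A, Z, Y):
    """trace(Z' D_Z Z + Y' D_Y Y - 2 Y' A' Z), written out term by term."""
    row_sums = [sum(row) for row in A]        # diagonal of D_Z
    col_sums = [sum(col) for col in zip(*A)]  # diagonal of D_Y
    term_z = sum(d * sum(v * v for v in z) for d, z in zip(row_sums, Z))
    term_y = sum(d * sum(v * v for v in y) for d, y in zip(col_sums, Y))
    cross = sum(
        A[i][j] * sum(a * b for a, b in zip(Z[i], Y[j]))
        for i in range(len(Z))
        for j in range(len(Y))
    )
    return term_z + term_y - 2.0 * cross


print(q_edges(A, Z, Y), q_trace(A, Z, Y))
```

Both functions accept coordinates of any common dimension, since the inner sums run over the coordinates of each vertex; the one-dimensional case is used here only to keep the arithmetic easy to follow by hand.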
Here $D_Y$ is a diagonal matrix containing the column sums of $A$ and $D_Z$ another
diagonal matrix containing the row sums of $A$. In the case of a contingency table,
both $D_Y$ and $D_Z$ contain the marginal frequencies of the two variables, while for
a multivariate categorical data set $D_Y$ again contains the univariate marginals of all
the categories of all the variables and $D_Z = JI$ is a constant multiple of the identity
matrix, with $J$ denoting the number of variables in the data set. A modification of