correlation matrices using eigenvalue decomposition. Given a p-dimensional proximity
matrix D, a sequence of correlation matrices $R = (R^{(1)}, R^{(2)}, \ldots)$ is iteratively
formed from it. Here $R^{(1)}$ is the correlation matrix of the original proximity matrix
D, and $R^{(n)}$ is the correlation matrix of $R^{(n-1)}$ for $n > 1$. The iteratively formed
sequence of correlation matrices gradually concentrates the variation information in the
leading eigenvectors. At the iteration with rank two, there are only two eigenvectors
left with nonzero eigenvalues, and all information is reduced to the ellipse spanned by
the two eigenvectors. Every object has its relative position on this two-dimensional
ellipse, and a unique permutation is obtained. Elliptical seriation usually identifies
very good global permutations, and is useful for identifying global clustering
patterns and smooth temporal gene expression profiles (Tien et al., ) by optimizing
the Robinson criterion.
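
The iteration described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names (`pearson`, `corr_matrix`, `leading_eigenpair`, `elliptical_order`), the tolerance `tol`, the use of power iteration with deflation to obtain the two leading eigenvectors, and the rule of cutting the circular order at its largest angular gap are all choices made here for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    denom = math.sqrt(sxx * syy)
    return sxy / denom if denom > 0 else 0.0

def corr_matrix(M):
    """Correlation matrix between the rows of M."""
    return [[pearson(ri, rj) for rj in M] for ri in M]

def leading_eigenpair(A, iters=500):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix (power iteration)."""
    n = len(A)
    v = [1.0 + 0.1 * i for i in range(n)]  # fixed start vector, keeps the result deterministic
    for _ in range(iters):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-15:
            return 0.0, v
        v = [x / norm for x in w]
    Av = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(Av, v)), v

def elliptical_order(D, tol=1e-6, max_iter=50):
    """Iterate correlation matrices until (numerically) rank two, then order by angle."""
    n = len(D)
    R = corr_matrix(D)  # R^(1)
    lam1 = lam2 = 0.0
    v1 = v2 = [0.0] * n
    for _ in range(max_iter):
        lam1, v1 = leading_eigenpair(R)
        deflated = [[R[i][j] - lam1 * v1[i] * v1[j] for j in range(n)] for i in range(n)]
        lam2, v2 = leading_eigenpair(deflated)
        # R is positive semidefinite, so trace - lam1 - lam2 is the sum of all other eigenvalues.
        trace = sum(R[i][i] for i in range(n))
        if trace - lam1 - lam2 < tol:
            break
        R = corr_matrix(R)  # R^(n) is the correlation matrix of R^(n-1)
    # Position of each object in the plane of the two leading eigenvectors.
    s1, s2 = math.sqrt(max(lam1, 0.0)), math.sqrt(max(lam2, 0.0))
    angles = [math.atan2(s2 * v2[i], s1 * v1[i]) for i in range(n)]
    ring = sorted(range(n), key=lambda i: angles[i])
    # Assumption: open the circular order at its largest angular gap.
    gaps = [(angles[ring[(k + 1) % n]] - angles[ring[k]]) % (2 * math.pi) for k in range(n)]
    cut = max(range(n), key=lambda k: gaps[k]) + 1
    return ring[cut:] + ring[:cut]

# Hypothetical example: objects on a line, presented in shuffled order.
pts = [3.0, 0.0, 5.0, 1.0, 4.0, 2.0]
D = [[abs(a - b) for b in pts] for a in pts]
order = elliptical_order(D)
print([pts[i] for i in order])
```

Power iteration is used only to keep the sketch free of dependencies; a numerical library's symmetric eigensolver would be the more usual choice for matrices of realistic size.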
Local Criterion: Minimal Span Loss Function
The minimal span loss function $MS = \sum_{i=1}^{n-1} d_{i,i+1}$ for a permuted matrix D
focuses on the optimization of local structures. The idea is to find a shortest path
through all data elements, as in the traveling salesman problem. The local seriation
method produces tighter blocks than the global method does around the main diagonal
of the proximity matrix. In addition, we can combine the anti-Robinson measure
and minimal span loss into a measure in which a band along the diagonal of a proximity
matrix $D = [d_{ij}]$ is selected with width $w$ ($1 < w < n$), and the anti-Robinson measurement
is computed within that band.
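
As a rough illustration of the two local measures, the sketch below computes the minimal span loss of a given permutation and counts anti-Robinson events inside a diagonal band. The function names are hypothetical, and two assumptions are made that the text does not fix: `w` is treated as the number of off-diagonal positions kept on each side of the diagonal, and an anti-Robinson event is counted whenever, within a row, a dissimilarity fails to be non-decreasing when moving away from the diagonal.

```python
def minimal_span_loss(D, order):
    """Sum of proximities between consecutive objects in the permuted order."""
    return sum(D[order[i]][order[i + 1]] for i in range(len(order) - 1))

def banded_anti_robinson(D, order, w):
    """Count anti-Robinson events among entries within w positions of the diagonal."""
    n = len(order)
    P = [[D[a][b] for b in order] for a in order]  # permuted proximity matrix
    events = 0
    for i in range(n):
        # Left of the diagonal (j < k < i): entry j is farther from the diagonal than k,
        # so it should not be smaller.
        for j in range(max(0, i - w), i):
            for k in range(j + 1, i):
                if P[i][j] < P[i][k]:
                    events += 1
        # Right of the diagonal (i < j < k): entry k is farther from the diagonal than j,
        # so it should not be smaller.
        hi = min(n, i + w + 1)
        for j in range(i + 1, hi):
            for k in range(j + 1, hi):
                if P[i][j] > P[i][k]:
                    events += 1
    return events

# Hypothetical example: five objects on a line.
pts = [0, 1, 2, 3, 4]
D = [[abs(a - b) for b in pts] for a in pts]
sorted_order = [0, 1, 2, 3, 4]
swapped_order = [0, 2, 1, 3, 4]
print(minimal_span_loss(D, sorted_order))   # 4
print(minimal_span_loss(D, swapped_order))  # 6
print(banded_anti_robinson(D, sorted_order, 2), banded_anti_robinson(D, swapped_order, 2))
```

With this reading of `w`, a band of `w = 1` contains no pair of entries to compare in any row, which is consistent with the requirement that the band be wider than one.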
Tree Seriation
The hierarchical clustering tree with a dendrogram (Eisen et al., ) is the most
popular method for two-way sorting the gene-by-array matrix map employed in gene
expression profiling. The ordering of terminal nodes generated by an agglomerative
hierarchical clustering tree automatically keeps good local grouping structure,
since the tree dendrogram is constructed through a sequential bottom-up merging
of “most similar” subnodes. On the other hand, a divisive hierarchical clustering tree
usually retains better global patterns through a top-down splitting of “most hetero-
geneous” substructures. Divisive hierarchical clustering trees are rarely used due to
their computational complexity.
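
To make the bottom-up construction concrete, the following sketch merges the two closest clusters at each step and reads the ordering of the terminal nodes off the final cluster. It is only an illustration: the function name `agglomerative_leaf_order` is hypothetical, average linkage is chosen arbitrarily from the usual linkage methods, and each merge simply places the first cluster to the left of the second.

```python
def agglomerative_leaf_order(D):
    """Average-linkage agglomerative clustering; returns the leaves in dendrogram order."""
    # Each cluster is stored as the list of its leaves, in left-to-right layout order.
    clusters = [[i] for i in range(len(D))]

    def avg_dist(a, b):
        return sum(D[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the pair of "most similar" clusters.
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_dist(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        merged = clusters[x] + clusters[y]  # merged leaves stay adjacent in the layout
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)] + [merged]
    return clusters[0]

# Hypothetical example: two well-separated groups on a line.
pts = [0.0, 10.0, 1.0, 11.0, 0.5]
D = [[abs(a - b) for b in pts] for a in pts]
order = agglomerative_leaf_order(D)
print(order, [pts[i] for i in order])
```

Because every merge keeps the leaves of the two subnodes adjacent, objects that were joined early end up as neighbours in the final ordering, which is the local grouping property noted above.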
Flipping of Intermediate Nodes
One critical issue when applying the leaves of the dendrogram in order to sort the
rows/columns of an expression profile matrix is the flipping of the intermediate nodes.
As illustrated in Fig. . with a schematic dendrogram (Fig. . a), the $n - 1$ intermediate
nodes for a dendrogram of n objects can be flipped independently (Fig. . b),
resulting in $2^{n-1}$ different dendrogram layouts (Figure . c, for example) and
corresponding permutations for the n objects with identical proximity matrices (Pearson
correlation or Euclidean distance) and the same tree linkage method (single,
complete, average or centroid). The flipping mechanism of intermediate nodes can
be guided by either an external or an internal reference list. For example, the Cluster