Information Technology Reference
In-Depth Information
xMotif. In the framework proposed by Murali and Kasif [39], biclusters are
defined such that samples are nearly constantly expressed across the selection of
features. In first step, the input matrix is preprocessed by assigning each sam-
ple a set of statistically significant states. These states define the set of valid bi-
clusters: A bicluster is a submatrix where each sample is exactly in the same
state for all selected features. To identify the largest valid biclusters, an itera-
tive search method is proposed that is run on different random seeds, similarly to
ISA.
OREO. DiMaggio Jr. et al. [19] proposed an algorithm of optimal re-ordering
(OREO) of the rows and columns of the data matrix A to biclustering. The idea of
OREO is to optimally rearrange the rows and columns of data matrix A to minimize
the similarities between rows and columns in the rearranged matrix. The algorithm
has three main iterative steps: optimally re-ordering rows (or columns) of the data
matrix; computing the median for each pair of neighboring rows (or columns) in
the final rearranged matrix, sorting these values from highest to lowest and classi-
fying cluster boundaries between the rows (or columns) to obtain submatrices; and
optimally re-ordering the columns (or rows) of each submatrix and computing the
cluster boundaries for the re-ordered columns (or rows) analogous to the second
step.
Here we use rows to reorder, and the authors [19] defined three associated cost
measurement functions between row i and row i :
m
j = 1 | a ij a i j |,
m
j = 1 ( a ij a i j )
j (
a ij
a i j )
2
2
c ii =
,
.
m
The authors [19] use two models to reorder rows in order to minimize the total
similarities between rows of final rearranged matrix: the network flow model and
TSP model, which are ideas from network optimization. In the network flow model,
defining the binary variables
1
if row i is adjacent and above i in the final ordering;
,
y row
=
ii
0
,
otherwise,
and two additional ones for the topmost and bottommost rows
1
,
if row i is the topmost row in the final ordering;
y source row
i
=
0
,
otherwise,
1
,
if row i is the bottommost row in the final ordering;
y sink row
=
i
0
,
otherwise,
and choosing one of the three associated cost measurement functions, the optimiza-
tion problem is to find solution to binary variables y row
ii
y sink row
i
y source row
i
,
,
,
 
Search WWH ::




Custom Search