Databases Reference
In-Depth Information
BiclusteringMethods
The previous specification of the types of biclusters only considers ideal cases. In real
data sets, such perfect biclusters rarely exist. When they do exist, they are usually very
small. Instead, random noise can affect the readings of e ij and thus prevent a bicluster in
nature from appearing in a perfect shape.
There are two major types of methods for discovering biclusters in data that may
come with noise. Optimization-based methods conduct an iterative search. At each
iteration, the submatrix with the highest significance score is identified as a bicluster.
The process terminates when a user-specified condition is met. Due to cost concerns
in computation, greedy search is often employed to find local optimal biclusters. Enu-
meration methods use a tolerance threshold to specify the degree of noise allowed in
the biclusters to be mined, and then tries to enumerate all submatrices of biclusters that
satisfy the requirements. We use the
-Cluster and MaPle algorithms as examples to
illustrate these ideas.
OptimizationUsingthe -ClusterAlgorithm
For a submatrix, I J , the mean of the i th row is
X
1
j J j
e iJ D
e ij .
(11.16)
j 2 J
Symmetrically, the mean of the j th column is
X
1
j I j
e Ij D
e ij .
(11.17)
i 2 I
The mean of all elements in the submatrix is
X
X
X
1
j I jj J j
1
j I j
1
j J j
e IJ D
e ij D
e iJ D
e Ij .
(11.18)
i 2 I , j 2 J
i 2 I
j 2 J
The quality of the submatrix as a bicluster can be measured by the mean-squared residue
value as
X
i 2 I , j 2 J .
1
j I jj J j
2 .
H
.
I J
/D
e ij e iJ e Ij C e IJ /
(11.19)
0 is a threshold. When
D 0, I J is a perfect bicluster with coherent values. By setting
Submatrix I J is a
-bicluster if H
.
I J
/
, where
0, a user can
specify the tolerance of average noise per element against a perfect bicluster, because
in Eq. (11.19) the residue on each element is
>
residue
.
e ij /D e ij e iJ e Ij C e IJ .
(11.20)
A maximal
-bicluster is a
-bicluster I J such that there does not exist another
-bicluster I 0 J 0 , and I I 0 , J J 0 , and at least one inequality holds. Finding the
 
Search WWH ::




Custom Search