Information Technology Reference
In-Depth Information
1
|S k |
μ ( c )
jk
=
a ij ,
(6.8)
i
S k
and the mean of all the entries in B k is
μ k =
F k a ij
|F k ||S k |
i
S k
j
.
(6.9)
The residue of the entry a ij in bicluster B k is
a ij μ ( r )
ik μ ( c )
r ij =
jk + μ k ,
(6.10)
the variance of bicluster B k is
B k )=
i ∈S k
2
Va r
(
j ∈F k (
a ij μ k )
,
(6.11)
and mean squared residue of the bicluster B k is
H k = i ∈S k j ∈F k r ij
|F k ||S k |
.
(6.12)
The first approach of biclustering by Hartigan [28] is known as block clustering ,
with the objective function as
K
k = 1 Va r ( B k )=
K
k = 1
i S k
2
min Var
( B )=
j F k (
a ij μ k )
,
where the number of biclusters is a given number. For each bicluster, the variance
Va r
(
B k )
is 0 if it is constant.
CC. Cheng and Church's Algorithm (CC) [11] defines a bicluster to be a sub-
matrix for which the mean squared residue score is below a user-defined threshold
δ
represents the minimum possible value. To find the largest
bicluster in A , they propose a two-phase strategy: removing rows and columns and
then adding the removed rows and columns with some rules. First, the row to be
removed is the one
, i.e., H k δ
, where
δ
1
|F k |
r ij ,
arg max
i
j
F k
and column is
1
|S k | j S k r ij .
Repeating these removing steps until the bicluster with H k δ
arg max
j
obtained. Then some
previously removed rows and columns can be added without violating the require-
ment of H k δ
. Yang et al. [58, 59] proposed an improved version of this algorithm
which allows missing data entry of A with a heuristic flexible overlapped clustering
(FLOC) algorithm.
 
Search WWH ::




Custom Search