where $\mu_k$ is the main effect of bicluster $k$, $\alpha_{ik}$ and $\beta_{jk}$ are the effects of sample $i$ and feature $j$, respectively, in bicluster $k$, $\varepsilon_{ijk}$ is the noise term for bicluster $k$, and $e_{ij}$ models the data points that do not belong to any bicluster. Here
$\delta_{ik}$ and $\kappa_{jk}$ are binary variables: $\delta_{ik} = 1$ indicates that row $i$ belongs to bicluster $k$, and $\delta_{ik} = 0$ otherwise; similarly, $\kappa_{jk} = 1$ indicates that column $j$ is in bicluster $k$, and $\kappa_{jk} = 0$ otherwise.
In the plaid model [50], the entry $a_{ij}$ follows a similar assumption with fewer factors considered.
In nonoverlapping feature biclustering, $\sum_{k=1}^{K} \kappa_{jk} \le 1$, and in nonoverlapping sample biclustering, $\sum_{k=1}^{K} \delta_{ik} \le 1$. Here, the nonoverlapping sample case is discussed. The priors of the indicators $\kappa$ and $\delta$ are set so that a feature can be in multiple biclusters while a sample is in at most one.
In this model, an observation $a_{ij}$ can belong to either one or none of the biclusters, and the probability distribution of $a_{ij}$ conditional on the bicluster indicators can be rewritten as
$$ a_{ij} \mid \delta_{ik} = 1, \kappa_{jk} = 1 \sim N(\mu_k + \alpha_{ik} + \beta_{jk}, \sigma_{\varepsilon k}^2) $$
if $a_{ij}$ belongs to bicluster $k$; otherwise,
$$ a_{ij} \mid \delta_{ik}\kappa_{jk} = 0 \text{ for all } k \sim N(0, \sigma_e^2). $$
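The two conditional distributions can be illustrated with a minimal sketch that draws a single entry given the indicators and the effects. The function name `sample_entry`, the nested-list layout of the arguments, and the use of standard deviations rather than variances are assumptions made for this illustration; they do not come from the BBC implementation.

```python
import random


def sample_entry(i, j, delta, kappa, mu, alpha, beta, sigma_eps, sigma_e, rng):
    """Draw a_ij given the bicluster indicators and the effects.

    delta[i][k], kappa[j][k] : 0/1 indicators for sample i and feature j
    mu[k], alpha[i][k], beta[j][k] : main, sample and feature effects
    sigma_eps[k], sigma_e : standard deviations (hypothetical argument layout)
    """
    # Biclusters that contain this entry, i.e. delta_ik * kappa_jk = 1.
    owners = [k for k in range(len(mu)) if delta[i][k] * kappa[j][k] == 1]
    if len(owners) > 1:
        raise ValueError("an entry may belong to at most one bicluster")
    if owners:
        k = owners[0]
        # a_ij ~ N(mu_k + alpha_ik + beta_jk, sigma_eps_k^2)
        return rng.gauss(mu[k] + alpha[i][k] + beta[j][k], sigma_eps[k])
    # Background entry: a_ij ~ N(0, sigma_e^2)
    return rng.gauss(0.0, sigma_e)


# Illustrative values only: 3 samples, 2 features, K = 2 biclusters.
rng = random.Random(0)
delta = [[1, 0], [0, 1], [0, 0]]   # each sample is in at most one bicluster
kappa = [[1, 0], [1, 1]]           # feature 1 is in both biclusters
mu = [1.0, 2.0]
alpha = [[0.5, 0.0], [0.0, 0.25], [0.0, 0.0]]
beta = [[0.0, 0.0], [0.0, 1.0]]
matrix = [
    [sample_entry(i, j, delta, kappa, mu, alpha, beta, [0.1, 0.1], 1.0, rng)
     for j in range(2)]
    for i in range(3)
]
```

Because each sample is restricted to at most one bicluster, the list `owners` never holds more than one index for valid indicators, which is why a single entry has a single well-defined mean.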
With Gaussian zero-mean priors on the effect parameters, the marginal distribution of the $a_{ij}$ conditional on the indicators is
$$ B \mid \delta, \kappa \sim N(0, \Sigma), $$
where $\Sigma$ is the covariance matrix of $B$ and $B = \{B_0, B_1, B_2, \cdots, B_K\}^T$ with $B_k = \{a_{ij} : \delta_{ik}\kappa_{jk} = 1\}$, $k \ge 1$, and $B_0$ being the vector of data points belonging to no bicluster. More specifically,
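$\Sigma$ is a sparse matrix of the form
$$ \Sigma = \begin{pmatrix} \sigma_e^2 I & 0 & \cdots & 0 \\ 0 & \Sigma_1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \Sigma_K \end{pmatrix}, $$
where $\Sigma_k = \mathrm{Cov}(B_k, B_k)$ is the covariance matrix of all data points belonging to cluster $k$.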
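To make the block $\Sigma_k$ concrete, the sketch below builds it for one bicluster. It assumes that $\mu_k$, $\alpha_{ik}$, $\beta_{jk}$ and the noise are mutually independent zero-mean Gaussians with variances $\sigma_{\mu k}^2$, $\sigma_{\alpha k}^2$, $\sigma_{\beta k}^2$ and $\sigma_{\varepsilon k}^2$; independence is an assumption of this illustration rather than something stated in the passage. Under that assumption, two entries share $\sigma_{\mu k}^2$ always, $\sigma_{\alpha k}^2$ when they are in the same row, $\sigma_{\beta k}^2$ when they are in the same column, and $\sigma_{\varepsilon k}^2$ only on the diagonal. The function name and argument names are hypothetical.

```python
def bicluster_covariance(rows, cols, var_mu, var_alpha, var_beta, var_eps):
    """Covariance matrix of the entries {a_ij : i in rows, j in cols}.

    Entries are ordered row-major. Assumes independent zero-mean Gaussian
    priors on the effects (an assumption of this sketch).
    Returns (cells, cov): the (i, j) label of each entry and the matrix
    as a list of lists.
    """
    cells = [(i, j) for i in rows for j in cols]
    cov = []
    for (i, j) in cells:
        line = []
        for (i2, j2) in cells:
            c = var_mu                      # shared main effect mu_k
            if i == i2:
                c += var_alpha              # shared sample effect alpha_ik
            if j == j2:
                c += var_beta               # shared feature effect beta_jk
            if i == i2 and j == j2:
                c += var_eps                # noise only on the diagonal
            line.append(c)
        cov.append(line)
    return cells, cov


# Illustrative 2 x 2 bicluster with made-up variances.
cells, sigma_k = bicluster_covariance([0, 1], [0, 1], 1.0, 2.0, 3.0, 4.0)
```

The blocks for different biclusters do not interact in $\Sigma$ because, in this sketch as in the text, entries from different biclusters share no effect parameters.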
To make inference from the above BBC model, a Gibbs sampling method is used. Initializing from a set of randomly assigned values of the $\delta$'s and $\kappa$'s, the column indicators $\kappa$ are sampled by calculating the log-probability ratio
$$ \log \frac{P(V_2 \mid \kappa_{jk} = 1, \sigma_{\mu k}^2, \sigma_{\alpha k}^2, \sigma_{\beta k}^2, \sigma_{\varepsilon k}^2, \sigma_e^2)\, P(\kappa_{jk} = 1)}{P(V_2 \mid \kappa_{jk} = 0, \sigma_{\mu k}^2, \sigma_{\alpha k}^2, \sigma_{\beta k}^2, \sigma_{\varepsilon k}^2, \sigma_e^2)\, P(\kappa_{jk} = 0)}, $$
where $V_1 = \{a_{il} : \delta_{ik} = 0 \text{ or } \kappa_{lk} = 0,\ l \ne j\}$ is the set of data points not in cluster $k$, and $V_2 = \{a_{il} : \delta_{ik} = 1, \kappa_{lk} = 1,\ l \ne j\} \cup \{a_{ij} : \delta_{ik} = 1\}$ is the set of data points that are or can be in bicluster $k$. This notation follows that in [26].
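Once the log-probability ratio is available, drawing the indicator amounts to a Bernoulli draw with success probability equal to the logistic transform of the ratio. The sketch below shows only that final step; the two log-likelihood arguments stand for $\log P(V_2 \mid \kappa_{jk} = 1, \ldots)$ and $\log P(V_2 \mid \kappa_{jk} = 0, \ldots)$ and are assumed to be computed elsewhere. The function names and the argument `prior_1` (standing for $P(\kappa_{jk} = 1)$) are hypothetical.

```python
import math
import random


def prob_from_log_ratio(log_ratio):
    """Convert log(P1 / P0) into P1 / (P0 + P1) without overflow."""
    if log_ratio >= 0:
        return 1.0 / (1.0 + math.exp(-log_ratio))
    z = math.exp(log_ratio)
    return z / (1.0 + z)


def sample_indicator(log_lik_1, log_lik_0, prior_1, rng):
    """Draw kappa_jk from its conditional distribution.

    log_lik_1, log_lik_0 : log P(V_2 | kappa_jk = 1, ...) and
                           log P(V_2 | kappa_jk = 0, ...), supplied by the caller
    prior_1              : prior probability P(kappa_jk = 1), strictly in (0, 1)
    Returns (kappa, p1), where p1 is the conditional probability of kappa = 1.
    """
    log_ratio = (log_lik_1 + math.log(prior_1)) - (
        log_lik_0 + math.log(1.0 - prior_1)
    )
    p1 = prob_from_log_ratio(log_ratio)
    kappa = 1 if rng.random() < p1 else 0
    return kappa, p1


# Illustrative call with made-up log-likelihood values.
rng = random.Random(0)
kappa_jk, p1 = sample_indicator(-12.3, -15.0, 0.5, rng)
```

Working with the log ratio rather than the two probabilities keeps the computation stable when the likelihoods of $V_2$ are very small.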