where $a_{ij}$ is the expression of the $i$th feature in the $j$th sample.
Biclustering is applied by simultaneous classification of the samples and features (i.e., columns and rows of matrix $A$, respectively) into $k$ classes. Let $S_1, S_2, \ldots, S_k$ denote the classes of the samples (columns) and $F_1, F_2, \ldots, F_k$ denote the classes of features (rows). Formally, biclustering can be defined as a collection of pairs of sample and feature subsets
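$$B = \{(S_1, F_1), (S_2, F_2), \ldots, (S_k, F_k)\}$$
such that
$$S_1, S_2, \ldots, S_k \subseteq \{a^j\}_{j=1,\ldots,n}, \qquad \bigcup_{r=1}^{k} S_r = \{a^j\}_{j=1,\ldots,n}, \qquad S_\zeta \cap S_\xi = \emptyset \Leftrightarrow \zeta \neq \xi,$$
$$F_1, F_2, \ldots, F_k \subseteq \{a_i\}_{i=1,\ldots,m}, \qquad \bigcup_{r=1}^{k} F_r = \{a_i\}_{i=1,\ldots,m}, \qquad F_\zeta \cap F_\xi = \emptyset \Leftrightarrow \zeta \neq \xi,$$
where $\{a^j\}_{j=1,\ldots,n}$ and $\{a_i\}_{i=1,\ldots,m}$ denote the sets of columns and rows of the matrix $A$, respectively.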
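The partition conditions above can be checked mechanically. The following is a minimal sketch, assuming each class is given as a collection of 0-based column indices (for $S_r$) or row indices (for $F_r$); the function name `is_valid_biclustering` and this index-based representation are illustrative choices, not part of the original formulation.

```python
def is_valid_biclustering(pairs, n_samples, n_features):
    """Check that the pairs (S_r, F_r) partition the columns and the rows.

    pairs: list of (S_r, F_r), where S_r holds column (sample) indices
           and F_r holds row (feature) indices. Hypothetical representation.
    """
    def is_partition(classes, size):
        seen = set()
        for cls in classes:
            cls = set(cls)
            if seen & cls:          # two classes share an element
                return False
            seen |= cls
        # the union of the classes must be exactly {0, ..., size - 1}
        return seen == set(range(size))

    sample_classes = [S for S, _ in pairs]
    feature_classes = [F for _, F in pairs]
    return (is_partition(sample_classes, n_samples)
            and is_partition(feature_classes, n_features))
```

For example, with three samples and three features, `[({0, 1}, {0}), ({2}, {1, 2})]` satisfies the conditions, whereas a collection in which some sample appears in two classes, or in none, does not.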
The ultimate goal in a biclustering problem is to find a classification for which samples from the same class have similar values for that class's characteristic features. The visualization of a reasonable classification should reveal a block-diagonal or “checkerboard” pattern. A detailed survey on biclustering techniques can be found in [5] and [8].
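One simple way to look for such a pattern is to permute the rows and columns of $A$ so that members of the same class become adjacent. The sketch below assumes NumPy and assumes the classes are encoded as integer label vectors (one label per row and per column); the helper name `reorder_by_class` is hypothetical.

```python
import numpy as np

def reorder_by_class(A, sample_labels, feature_labels):
    """Group rows (features) and columns (samples) of A by class label."""
    A = np.asarray(A)
    # A stable sort keeps the original order within each class.
    row_order = np.argsort(feature_labels, kind="stable")
    col_order = np.argsort(sample_labels, kind="stable")
    return A[np.ix_(row_order, col_order)], row_order, col_order

A = [[9, 1, 9],
     [1, 9, 1],
     [9, 1, 9]]
B, rows, cols = reorder_by_class(A, sample_labels=[0, 1, 0],
                                 feature_labels=[0, 1, 0])
# B now has the high values gathered in blocks along the diagonal:
# [[9, 9, 1],
#  [9, 9, 1],
#  [1, 1, 9]]
```

The reordered matrix can then be passed to any heat-map routine for visual inspection.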
The concept of consistent biclustering is introduced in [3]. Formally, a biclustering $B$ is consistent if, in each sample (feature) from any set $S_r$ (set $F_r$), the average expression of features (samples) that belong to the same class $r$ is greater than the average expression of features (samples) from other classes. The model for supervised biclustering involves solving a special case of the fractional 0-1 programming problem, in which consistency is achieved by feature selection. Computational results on microarray data mining problems are obtained by reformulating the problem as a linear mixed 0-1 programming problem.
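The consistency condition can be tested directly from class averages. The following is a minimal sketch of such a check only, not the fractional 0-1 programming model of [3]. It assumes NumPy, integer class labels in `0..k-1`, and that every class is non-empty; the function names are illustrative.

```python
import numpy as np

def _own_class_strictly_largest(averages, labels):
    """True if, in every row, the entry for the row's own class
    is strictly greater than the entries for all other classes."""
    idx = np.arange(averages.shape[0])
    own = averages[idx, labels]
    others = averages.copy()
    others[idx, labels] = -np.inf   # mask out the own-class entry
    return bool(np.all(own > others.max(axis=1)))

def is_consistent(A, sample_labels, feature_labels, k):
    """Check consistency of a biclustering of A (rows = features,
    columns = samples). Assumes every class 0..k-1 is non-empty."""
    A = np.asarray(A, dtype=float)
    s = np.asarray(sample_labels)
    f = np.asarray(feature_labels)
    # avg_over_samples[i, r]: mean of feature i over the samples in S_r
    avg_over_samples = np.column_stack(
        [A[:, s == r].mean(axis=1) for r in range(k)])
    # avg_over_features[j, r]: mean of sample j over the features in F_r
    avg_over_features = np.column_stack(
        [A[f == r, :].mean(axis=0) for r in range(k)])
    return (_own_class_strictly_largest(avg_over_samples, f)
            and _own_class_strictly_largest(avg_over_features, s))
```

For instance, a matrix whose first two features are high on the first two samples and whose third feature is high on the remaining samples is consistent under the matching labelling, and stops being consistent if the feature labels are swapped.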
An improved heuristic procedure is proposed in [9], where a linear programming problem with continuous variables is solved at each iteration.
Numerical