Biology Reference
In-Depth Information
where $a_{ij}$ is the expression of the $i$th feature in the $j$th sample.
Biclustering is applied by simultaneous classification of the samples
and features (i.e., columns and rows of matrix $A$, respectively) into $k$
classes. Let $S_1, S_2, \ldots, S_k$ denote the classes of the samples (columns) and
$F_1, F_2, \ldots, F_k$ denote the classes of features (rows). Formally, biclustering
can be defined as a collection of pairs of sample and feature subsets
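$$\mathcal{B} = \{(S_1, F_1), (S_2, F_2), \ldots, (S_k, F_k)\}$$
such that
$$S_1, S_2, \ldots, S_k \subseteq \{a^j\}_{j=1,\ldots,n}, \qquad \bigcup_{r=1}^{k} S_r = \{a^j\}_{j=1,\ldots,n}, \qquad S_\zeta \cap S_\xi = \emptyset \Leftrightarrow \zeta \neq \xi,$$
$$F_1, F_2, \ldots, F_k \subseteq \{a_i\}_{i=1,\ldots,m}, \qquad \bigcup_{r=1}^{k} F_r = \{a_i\}_{i=1,\ldots,m}, \qquad F_\zeta \cap F_\xi = \emptyset \Leftrightarrow \zeta \neq \xi,$$
where $\{a^j\}_{j=1,\ldots,n}$ and $\{a_i\}_{i=1,\ldots,m}$ denote the sets of columns and rows of the
matrix $A$, respectively.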
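The conditions above say that the sample classes partition the columns and the feature classes partition the rows, with the same number of classes on both sides. A minimal sketch of such a check follows, assuming classes are given as sets of 0-based column or row indices; the function names `is_partition` and `is_biclustering` are hypothetical and chosen only for illustration. Because an empty intersection is required exactly when the two class indices differ, every class must also be non-empty.

```python
from itertools import combinations


def is_partition(classes, size):
    """Return True if `classes` (index collections) partition range(size)."""
    classes = [set(c) for c in classes]
    universe = set(range(size))
    # Each class must be non-empty: S_r intersected with itself is S_r,
    # which the definition requires to be non-empty.
    if any(not c for c in classes):
        return False
    # Each class must be a subset of the full index set.
    if any(not c <= universe for c in classes):
        return False
    # Together the classes must cover every index.
    if set().union(*classes) != universe:
        return False
    # Distinct classes must not overlap.
    return all(a.isdisjoint(b) for a, b in combinations(classes, 2))


def is_biclustering(sample_classes, feature_classes, n, m):
    """Check the partition conditions for n samples and m features."""
    return (
        len(sample_classes) == len(feature_classes)
        and is_partition(sample_classes, n)
        and is_partition(feature_classes, m)
    )
```

For instance, with three samples and four features, `is_biclustering([{0, 1}, {2}], [{0}, {1, 2, 3}], 3, 4)` holds, while overlapping sample classes such as `[{0, 1}, {1, 2}]` do not pass.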
The ultimate goal in a biclustering problem is to find a classification for which
samples from the same class have similar values for that class' characteristic
features. The visualization of a reasonable classification should reveal a block-
diagonal or “checkerboard” pattern. A detailed survey on biclustering techniques
can be found in [5] and [8].
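One way to see the block-diagonal pattern is to reorder the rows and columns of the data matrix so that members of the same class are adjacent. The sketch below assumes the matrix is a list of rows and that class membership is given as integer labels per feature and per sample; `reorder_by_class` and the toy matrix are hypothetical illustrations, not taken from the cited works.

```python
def reorder_by_class(A, feature_labels, sample_labels):
    """Group rows by feature class and columns by sample class."""
    # sorted() is stable, so the original order is kept within each class.
    row_order = sorted(range(len(A)), key=lambda i: feature_labels[i])
    col_order = sorted(range(len(A[0])), key=lambda j: sample_labels[j])
    return [[A[i][j] for j in col_order] for i in row_order]


# Toy data: rows 0 and 2 are class-0 features, columns 0 and 2 are class-0 samples.
A = [
    [9, 1, 8],
    [1, 7, 2],
    [8, 2, 9],
]
blocks = reorder_by_class(A, feature_labels=[0, 1, 0], sample_labels=[0, 1, 0])
for row in blocks:
    print(row)
# [9, 8, 1]
# [8, 9, 2]
# [1, 2, 7]
```

After reordering, the large values sit in the top-left 2x2 block and the bottom-right 1x1 block, which is the pattern a heat map of a reasonable classification would show.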
The concept of consistent biclustering is introduced in [3]. Formally, a
biclustering $\mathcal{B}$ is consistent if in each sample (feature) from any set $S_r$ (set $F_r$),
the average expression of features (samples) that belong to the same class $r$ is
greater than the average expression of features (samples) from other classes. The
model for supervised biclustering involves solving a special case of the fractional
0-1 programming problem, in which consistency is achieved by feature selection.
Computational results on microarray data mining problems are obtained by
reformulating the problem as a linear mixed 0-1 programming problem.
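The consistency condition can be checked directly on a labeled data matrix. The following is only a sketch of that check, not the optimization model from [3]: it assumes the matrix is a list of rows (features) by columns (samples), that labels are integers in `0..k-1`, that every class is non-empty, and that "greater than other classes" is read as strictly greater than the average for each other class taken separately. The name `is_consistent` is hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def is_consistent(A, feature_labels, sample_labels, k):
    """Check the consistency condition for a labeled m x n matrix A."""
    m, n = len(A), len(A[0])
    # Index lists F_r (rows) and S_r (columns) for each class r.
    features = [[i for i in range(m) if feature_labels[i] == r] for r in range(k)]
    samples = [[j for j in range(n) if sample_labels[j] == r] for r in range(k)]

    # Each sample: its own class's features must have the highest average.
    for j in range(n):
        r = sample_labels[j]
        own = mean([A[i][j] for i in features[r]])
        for q in range(k):
            if q != r and mean([A[i][j] for i in features[q]]) >= own:
                return False

    # Each feature: its own class's samples must have the highest average.
    for i in range(m):
        r = feature_labels[i]
        own = mean([A[i][j] for j in samples[r]])
        for q in range(k):
            if q != r and mean([A[i][j] for j in samples[q]]) >= own:
                return False
    return True
```

On the toy matrix `[[9, 1, 8], [1, 7, 2], [8, 2, 9]]` with feature labels `[0, 1, 0]`, the sample labels `[0, 1, 0]` pass this check, whereas the swapped sample labels `[1, 0, 1]` fail it.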
An improved heuristic procedure is proposed in [9], where a linear
programming problem with continuous variables is solved at each iteration.
Numerical