classify the samples. Sample $j$ is assigned to class $r$ if $c_{jr} = \max_{\xi}\{c_{j\xi}\}$, i.e.,
$$\hat{S}_r = \{\, a_j : c_{jr} > c_{j\xi},\ \xi \neq r \,\}. \qquad (13.4)$$
As before, the obtained classification $\hat{S}_r$ does not necessarily coincide with the classification $S_r$.
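For concreteness, the following sketch implements the assignment rule (13.4), assuming the usual setup in which the data matrix $A$ has features as rows and samples as columns and $c_{j\xi}$ is the average value of sample $j$ over the features placed in class $F_\xi$ (as in (13.2)); the function and variable names below are illustrative only.

```python
import numpy as np

def classify_samples(A, feature_classes, k):
    """Assign each sample (a column of A) to the class whose feature-block
    average is strictly largest, mirroring relation (13.4).

    A               : m x n data matrix (rows = features, columns = samples)
    feature_classes : length-m integer labels; feature_classes[i] is the class of feature i
    k               : number of classes
    Returns a length-n label array; -1 marks samples with no strict maximum.
    """
    feature_classes = np.asarray(feature_classes)
    m, n = A.shape
    # c[j, r] = average value of sample j over the features currently in class r
    c = np.stack([A[feature_classes == r, :].mean(axis=0) for r in range(k)], axis=1)

    labels = np.full(n, -1)
    for j in range(n):
        r = int(np.argmax(c[j]))
        # (13.4) requires a strict maximum: c_{jr} > c_{j xi} for every xi != r
        if all(c[j, r] > c[j, xi] for xi in range(k) if xi != r):
            labels[j] = r
    return labels
```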
Biclustering $\mathcal{B}$ is referred to as a consistent biclustering if relations (13.3) and (13.4) hold for all elements of the corresponding classes, where the matrices $C_S$ and $C_F$ are defined according to (13.1) and (13.2), respectively.
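As a sketch of how this definition can be checked in practice (again assuming the centroid conventions above; the helper below is illustrative and not part of the original formulation), a biclustering is consistent exactly when re-deriving the classes from the two centroid matrices reproduces the supplied classifications with strict inequalities:

```python
import numpy as np

def is_consistent(A, feature_classes, sample_classes, k):
    """Return True if the given biclustering is consistent: relations (13.3)
    and (13.4) must hold, with strict inequalities, for every feature and
    every sample under the supplied classifications.

    A               : m x n data matrix (rows = features, columns = samples)
    feature_classes : length-m integer labels in {0, ..., k-1}
    sample_classes  : length-n integer labels in {0, ..., k-1}
    """
    feature_classes = np.asarray(feature_classes)
    sample_classes = np.asarray(sample_classes)
    m, n = A.shape
    # Sample-side centroids: average of each sample over the features of each class.
    c_s = np.stack([A[feature_classes == r, :].mean(axis=0) for r in range(k)], axis=1)  # n x k
    # Feature-side centroids: average of each feature over the samples of each class.
    c_f = np.stack([A[:, sample_classes == r].mean(axis=1) for r in range(k)], axis=1)   # m x k

    samples_ok = all(
        all(c_s[j, sample_classes[j]] > c_s[j, xi]
            for xi in range(k) if xi != sample_classes[j])
        for j in range(n))
    features_ok = all(
        all(c_f[i, feature_classes[i]] > c_f[i, xi]
            for xi in range(k) if xi != feature_classes[i])
        for i in range(m))
    return samples_ok and features_ok
```

Checking both directions is what distinguishes a consistent biclustering from a one-sided nearest-centroid classification of samples or of features alone.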
A data set is biclustering-admitting if some consistent biclustering for it exists. Furthermore, the data set is called conditionally biclustering-admitting with respect to a given (partial) classification of some samples and/or features if there exists a consistent biclustering preserving the given (partial) classification.
Theorem 13.1. Let $\mathcal{B}$ be a consistent biclustering. Then there exist convex cones $P_1, P_2, \ldots, P_k \subseteq \mathbb{R}^m$ such that only samples from $S_r$ belong to the corresponding cone $P_r$, $r = 1, \ldots, k$. Similarly, there exist convex cones $Q_1, Q_2, \ldots, Q_k \subseteq \mathbb{R}^n$ such that only features from class $F_r$ belong to the corresponding cone $Q_r$, $r = 1, \ldots, k$.
See [3] for the proof of Theorem 13.1. It also follows from the proven conic
separability that convex hulls of classes do not intersect.
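For intuition, one natural choice of cones matching the statement (a sketch only; see [3] for the construction actually used in the proof) comes directly from relation (13.4). Writing $f_{i\xi}$ for the 0-1 indicator that feature $i$ belongs to class $F_\xi$ (the feature classification underlying $C_F$), set
$$P_r = \left\{ x \in \mathbb{R}^m : \frac{\sum_{i} f_{ir}\, x_i}{\sum_{i} f_{ir}} > \frac{\sum_{i} f_{i\xi}\, x_i}{\sum_{i} f_{i\xi}}\,,\ \ \forall \xi \neq r \right\}.$$
Each $P_r$ is cut out by homogeneous linear inequalities and is therefore a convex cone; for a consistent biclustering, relation (13.4) states precisely that every sample $a_j$ with $j \in S_r$ satisfies these inequalities, while samples of the other classes do not. The cones $Q_r$ for the features are obtained symmetrically from (13.3).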
By definition, a biclustering is consistent if $\hat{F}_r = F_r$ and $\hat{S}_r = S_r$. However,
a given data set might not have these properties. The features and/or samples in
the data set might not clearly belong to any of the classes and hence a consistent
biclustering might not be constructed. In such cases, one can remove a set of
features and/or samples from the data set so that there is a consistent biclustering
for the truncated data. Selection of a representative set of features that satisfies
certain properties is a widely used technique in data mining applications. This
feature selection process may incorporate various objective functions depending
on the desirable properties of the selected features, but one general choice is to
select the maximal possible number of features in order to lose as little of the information provided by the training set as possible.
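For illustration only, the following is a minimal sketch of such a truncation step in the same setup as above. It is a simple greedy heuristic, not the feature-selection formulation studied in [3]: features are discarded in order of increasing margin under relation (13.3) until relation (13.4) holds for every sample on the surviving features (removing features leaves the feature-side centroids of the remaining features unchanged, so only the sample-side relation needs to be re-checked).

```python
import numpy as np

def greedy_feature_selection(A, feature_classes, sample_classes, k):
    """Greedy heuristic: drop the weakest features (smallest margin under
    (13.3)) one at a time until relation (13.4) holds for every sample on
    the retained features. Returns the indices of the retained features,
    or an empty array if no consistent truncation is reached this way."""
    feature_classes = np.asarray(feature_classes)
    sample_classes = np.asarray(sample_classes)
    m, n = A.shape

    # Margin of feature i under (13.3): own-class centroid minus best competing class.
    c_f = np.stack([A[:, sample_classes == r].mean(axis=1) for r in range(k)], axis=1)
    own = c_f[np.arange(m), feature_classes]
    rivals = np.where(np.eye(k, dtype=bool)[feature_classes], -np.inf, c_f).max(axis=1)
    order = np.argsort(own - rivals)        # smallest-margin features are dropped first

    def samples_consistent(mask):
        # Every feature class must retain at least one feature to form centroids.
        if any(not np.any(mask & (feature_classes == r)) for r in range(k)):
            return False
        c_s = np.stack([A[mask & (feature_classes == r), :].mean(axis=0)
                        for r in range(k)], axis=1)
        return all(c_s[j, sample_classes[j]] > np.max(np.delete(c_s[j], sample_classes[j]))
                   for j in range(n))

    keep = np.ones(m, dtype=bool)
    for i in order:
        if samples_consistent(keep):
            break
        keep[i] = False
    return np.flatnonzero(keep) if samples_consistent(keep) else np.array([], dtype=int)
```

An exact treatment would instead maximize the number of retained features subject to the consistency constraints; the greedy pass above merely illustrates the idea of truncating the data.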
A problem with selecting the most representative features is the following.
Assume that there is a consistent biclustering for a given data set, and there is a
feature, $i$, such that the difference between the two largest values of $c_{ir}$ is negligible, i.e.,
$$\min_{\xi \neq r} \{\, c_{ir} - c_{i\xi} \,\} \leq \alpha,$$
where $\alpha$ is a small positive number. Although this particular feature is classified as a member of class $r$ (i.e., $a_i \in \hat{F}_r$), the corresponding relation (13.3) can be