violated by adding a slightly different sample to the data set. In other words, if $\alpha_i$ is a relatively small number, then it is not statistically evident that $a_i \in F_{\hat r}$, and feature $i$ cannot be used to classify the samples. Choosing the most representative features and samples matters because problems that require feature tests and large numbers of samples are expensive and time consuming. The weaker consistent biclustering can be replaced by the stronger additive and multiplicative consistent biclusterings. Additive consistent biclustering is introduced in [9] by relaxing (13.3) and (13.4) as
$$a_i \in F_{\hat r} \;\Longrightarrow\; c_{i\hat r} > \alpha_i + c_{i\xi}, \qquad \forall \xi,\ \xi \neq \hat r, \tag{13.5}$$
and
$$a_j \in S_{\hat r} \;\Longrightarrow\; c_{j\hat r} > \alpha_j + c_{j\xi}, \qquad \forall \xi,\ \xi \neq \hat r, \tag{13.6}$$
respectively, where $\alpha_i > 0$ and $\alpha_j > 0$.
Another relaxation in [9] is multiplicative consistent biclustering where (13.3)
and (13.4) are replaced with
$$a_i \in F_{\hat r} \;\Longrightarrow\; c_{i\hat r} > \beta_i \, c_{i\xi}, \qquad \forall \xi,\ \xi \neq \hat r, \tag{13.7}$$
and
$$a_j \in S_{\hat r} \;\Longrightarrow\; c_{j\hat r} > \beta_j \, c_{j\xi}, \qquad \forall \xi,\ \xi \neq \hat r, \tag{13.8}$$
respectively, where $\beta_i > 1$ and $\beta_j > 1$.
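The relaxed conditions above are straightforward to verify for a given classification. The sketch below, which assumes a class matrix $c$ (features $\times$ classes) and a vector of assigned class indices, checks the feature-side conditions (13.5) and (13.7); the function names and data layout are illustrative, not taken from [9].

```python
import numpy as np

def is_alpha_consistent(c, feature_classes, alpha):
    """Additive (alpha-)consistency, feature side (13.5):
    feature i assigned to class r must satisfy
    c[i, r] > alpha[i] + c[i, xi] for every other class xi."""
    for i, r in enumerate(feature_classes):
        for xi in range(c.shape[1]):
            if xi != r and not c[i, r] > alpha[i] + c[i, xi]:
                return False
    return True

def is_beta_consistent(c, feature_classes, beta):
    """Multiplicative (beta-)consistency, feature side (13.7):
    c[i, r] > beta[i] * c[i, xi] for every other class xi."""
    for i, r in enumerate(feature_classes):
        for xi in range(c.shape[1]):
            if xi != r and not c[i, r] > beta[i] * c[i, xi]:
                return False
    return True
```

The sample-side conditions (13.6) and (13.8) would be checked the same way on the transposed roles of rows and columns. Larger $\alpha_i$ (or $\beta_i$) makes the condition harder to satisfy, which is why these are stronger notions than plain consistency.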
Supervised biclustering uses accurately classified data sets, called the training set, to classify features and to formulate the consistent, $\alpha$-consistent, and $\beta$-consistent biclustering problems. The information obtained from these solutions can then be used to classify additional samples, known as the test set. This information is also useful for adjusting the values of the vectors $\alpha$ and $\beta$ to produce more characteristic features and to decrease the number of misclassifications.
Given a set of training data, construct the matrix $S$ and compute the values of $c_{i\xi}$ using (13.1). Classify the features according to the following rule: feature $i$ belongs to class $\hat r$ (i.e., $a_i \in F_{\hat r}$) if $c_{i\hat r} > c_{i\xi}$, $\forall \xi \neq \hat r$. Finally, construct the matrix $F$ using the obtained classification. Let $x_i$ denote a binary variable that is one if feature $i$ is included in the computations and zero otherwise. The consistent, $\alpha$-consistent, and $\beta$-consistent biclustering problems are formulated as follows.