tissue samples are 27 and 13 in the training set and 10 and 7 in the test
set, respectively.
5.2.2.2. Leukemia data
The original data were downloaded from the Internet
(http://www.broad.mit.edu/cgi-bin/cancer/datasets.cgi). The data contain the
expression levels of 7129 genes across 72 samples, of which 47 are ALL
samples and 25 are AML samples. The samples, taken from bone marrow and
peripheral blood, were divided into a training set (38 samples) and a test
set (34 samples).
5.3. Overall Methodology
The SDL global optimization method proposed in this study consists of the
following major steps:
(1) sampling within search spaces by using a suitable orthogonal array
instead of conducting a random search;
(2) constructing an objective function for optimization algorithms;
(3) using search space reduction strategies;
(4) searching for global optimal solutions;
(5) building up a multi-subset pyramidal hierarchy class predictor for
classification; and
(6) predicting through a voting mechanism.
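As a rough illustration of step (6), the sketch below combines the class labels returned by several sub-predictors into one prediction. The voting rule is not specified at this point in the text, so a simple majority vote is assumed here. The function name `majority_vote` and the example votes are hypothetical; only the class labels ALL and AML come from the leukemia data described above.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label that occurs most often in `predictions`.

    Assumption: plain majority voting. If two labels tie, the one that
    appears first in the list is returned.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    # most_common(1) gives a one-element list holding the (label, count) pair.
    return counts.most_common(1)[0][0]


# Hypothetical votes from three sub-predictors for a single test sample.
votes = ["ALL", "AML", "ALL"]
predicted_class = majority_vote(votes)
```

Using an odd number of sub-predictors for a two-class problem avoids ties altogether, which makes the tie-breaking assumption irrelevant in that case.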
5.3.1. Orthogonal arrays (OAs) and sampling procedure
OAs were introduced in the mid-twentieth century (Rao, 1946; Rao, 1947;
Rao, 1949). Many statistical texts on experimental designs cover OAs
(Cochran and Cox, 1957; Montgomery, 1997). OAs are often employed in
industrial experiments to study the effects of several control factors. An
OA is a type of experimental design in which the columns for the
independent variables are “orthogonal” to one another.
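To make the orthogonality property concrete, the sketch below checks it on a small, widely tabulated array with 9 rows, 4 columns, and 3 levels (coded 0, 1, 2). Orthogonality is taken here in the usual pairwise sense: for any two columns, every combination of levels appears in the same number of rows. The helper name `is_orthogonal` and the variable names are illustrative and do not come from the text.

```python
from collections import Counter
from itertools import combinations, product

# A 9-run array with 4 columns and 3 levels, with levels coded 0, 1, 2.
L9 = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 2, 2, 2],
    [1, 0, 1, 2],
    [1, 1, 2, 0],
    [1, 2, 0, 1],
    [2, 0, 2, 1],
    [2, 1, 0, 2],
    [2, 2, 1, 0],
]


def is_orthogonal(array, levels):
    """Check that every pair of columns contains each level pair equally often."""
    n_rows = len(array)
    n_cols = len(array[0])
    # Each of the levels**2 pairs must occur n_rows / levels**2 times.
    if n_rows % (levels ** 2) != 0:
        return False
    expected = n_rows // (levels ** 2)
    for a, b in combinations(range(n_cols), 2):
        counts = Counter((row[a], row[b]) for row in array)
        for pair in product(range(levels), repeat=2):
            if counts[pair] != expected:
                return False
    return True


# Changing a single entry breaks the balance between columns.
broken = [row[:] for row in L9]
broken[0][3] = 1

l9_ok = is_orthogonal(L9, 3)
broken_ok = is_orthogonal(broken, 3)
```

The 9 rows cover each pair of columns as evenly as a full grid of 3^4 = 81 combinations would, which is why such an array can stand in for exhaustive or random sampling of a search space.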
An OA is a matrix of n rows and k columns, with every element being
one of the q levels, and is normally represented in the form of $L_n(k^q)$. The