algorithm (GELF) and the state-of-the-art approach (baseline) [7] on their
respective ability to classify unseen data.
The experiment comprises the execution of several workflows. Each of them
processes one of the 20 microarray datasets used for the experiment under a
10-fold cross-validation scheme. Figure 4a shows one example GEAE workflow.
The sub-experiments cover a set of parameters in order to consider various
aspects of the methods being compared. As can be seen in Figure 4b, each
sub-experiment involves 3 types of tasks:
There are two ranker tasks that select genes in order to reduce the number
of features used for learning the classifier. The first one performs recursive
feature elimination using support vector machines (SVM-RFE), and the
second one returns a random order of features (RandomR). The third task
consists of learning and evaluating the performance of the GELF
classifier. GELF is a feature construction algorithm based on iterative
improvement of the best solution obtained by the state-of-the-art approach [7].
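The ranker and cross-validation steps described above can be illustrated with a minimal sketch. It covers only the random ranker and the 10-fold split; SVM-RFE and GELF are not reproduced here. The function names (random_ranker, select_top, k_fold_indices) and the seed parameter are hypothetical and chosen for illustration, not taken from the GEAE implementation.

```python
import random


def random_ranker(n_features, seed=0):
    """Return all feature indices in random order (stand-in for RandomR)."""
    order = list(range(n_features))
    random.Random(seed).shuffle(order)
    return order


def select_top(ranking, k):
    """Keep the k best-ranked feature indices to reduce the feature set."""
    return ranking[:k]


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size
```

In a sub-experiment of this shape, the ranker would be run on the training part of each fold, and the classifier would then be learned on the selected features and evaluated on the held-out part.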
(a) Scheme of one of the GEAE workflows. (b) Abstract and concrete views of an ML
sub-experiment.
Fig. 4. GEAE workflows. Overview of one of the GEAE workflows (a) and
decomposition of one sub-experiment (b).
Each workflow comprises 20 sub-experiments: both combinations of the GELF
task with the rankers (RandomR and SVM-RFE), applied to each of the 10 dataset
folds. Each workflow application therefore consists of 40 tasks (i.e. 10 RandomR
ranker executions, 10 SVM-RFE ranker executions and 20 GELF executions).
Since we executed the workflows over 20 different datasets, this
gives 800 task executions. To generate a wide spectrum of performance-data
examples, each workflow was executed 10 times on resources of different types.
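The task counts above can be checked with a short enumeration. This is a sketch under the assumption that every sub-experiment consists of exactly one ranker task followed by one GELF task; the tuple layout and the name workflow_tasks are hypothetical. The final figure assumes that each of the 10 repetitions reruns all tasks.

```python
from itertools import product

RANKERS = ["RandomR", "SVM-RFE"]
N_FOLDS = 10      # 10-fold cross-validation
N_DATASETS = 20   # microarray datasets
N_RUNS = 10       # repetitions on resources of different types


def workflow_tasks(dataset_id):
    """List the tasks of one workflow: a ranker and a GELF task per sub-experiment."""
    tasks = []
    for fold, ranker in product(range(N_FOLDS), RANKERS):
        tasks.append((dataset_id, fold, ranker))           # ranker task
        tasks.append((dataset_id, fold, "GELF", ranker))   # GELF task fed by that ranker
    return tasks


tasks = workflow_tasks(0)
sub_experiments = N_FOLDS * len(RANKERS)
total = sum(len(workflow_tasks(d)) for d in range(N_DATASETS))

print(sub_experiments)   # 20 sub-experiments per workflow
print(len(tasks))        # 40 tasks per workflow
print(total)             # 800 task executions over all datasets
print(total * N_RUNS)    # 8000 if every repetition reruns all tasks
```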
Table 1 describes the characteristics of the resources used for executing the GEAE
workflows. JavaMFlops, KFlops and MIPS are performance values provided by
the SciMark2¹, Linpack² and Dhrystone [16] benchmarks, respectively.
¹ SciMark2 benchmark. http://math.nist.gov/scimark2
² Linpack benchmark. http://www.netlib.org/linpack