indeed a good example of how gene expression programming can be suc-
cessfully used for extracting knowledge from huge databases and designing
good predictive models.
4.2.3 Fisher's Irises
In this classification problem the goal is to classify three different types of
irises based on four measurements: sepal length, sepal width, petal length,
and petal width. The iris dataset contains fifty examples each of three types
of iris: Iris setosa, Iris versicolor, and Iris virginica (Fisher 1936).
Classification problems with more than two classes, say n classes, can be
solved by GEP using two different approaches. The first one requires
decomposing the data into n separate 0/1 classification problems. The second
exploits the multigenic nature of gene expression programming to solve
problems with multiple outputs in one go: chromosomes composed of n
different genes are used to design a classification model composed of n
different sub-models, where each sub-model is responsible for identifying
a particular class.
The second approach is very appealing and the GEP system with multiple
outputs (GEP-MO) is indeed very efficient at solving relatively complex
problems of this kind. But, for really complex problems (say, problems with
more than 10 different classes and/or more than 50 attributes), the first ap-
proach, although more time-consuming, is a better choice as it is much more
flexible in terms of both the structure and the composition of each sub-model.
Both approaches will be compared below on the iris dataset.
Decomposing a Three-class Problem
The classification of data into n distinct classes C requires processing the
data into n separate 0/1 classification problems as follows:
1. C_1 versus NOT C_1
2. C_2 versus NOT C_2
...
n. C_n versus NOT C_n
Then n different sub-models are evolved separately and afterwards com-
bined in order to create the final model.
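The decomposition step can be illustrated with a short Python sketch. It is not part of the GEP system itself: the function names decompose_one_vs_rest and combine_sub_models are hypothetical, and the combination rule (accept a class only when exactly one sub-model outputs 1) is an assumption made for illustration, since the text says only that the sub-models are combined afterwards.

```python
from typing import Callable, Dict, Hashable, List, Optional, Sequence


def decompose_one_vs_rest(labels: Sequence[Hashable]) -> Dict[Hashable, List[int]]:
    """Build one 0/1 target vector per class: 1 for C_i, 0 for NOT C_i."""
    # dict.fromkeys keeps the classes in first-seen order and drops duplicates.
    classes = list(dict.fromkeys(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


def combine_sub_models(
    sub_models: Dict[Hashable, Callable[[Sequence[float]], int]],
    sample: Sequence[float],
) -> Optional[Hashable]:
    """Combine n separately evolved 0/1 sub-models into one prediction.

    Assumed rule (not taken from the text): return the class whose
    sub-model is the only one to output 1; otherwise return None.
    """
    fired = [c for c, model in sub_models.items() if model(sample) == 1]
    return fired[0] if len(fired) == 1 else None


# Toy labels, only to show the shape of the three 0/1 problems.
labels = ["setosa", "versicolor", "virginica", "setosa"]
targets = decompose_one_vs_rest(labels)
print(targets["setosa"])  # [1, 0, 0, 1]
```

Each entry of targets is the target column of one separate 0/1 problem, so a sub-model can be evolved against it independently of the others.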
For the iris data we are going to decompose our problem into three sepa-
rate 0/1 classification problems. The first one is Iris setosa versus NOT Iris