PQQQTcQabcbbcabbbbbca2110399977-[30]
C = {0.73, 1.58, 4.93, 3.13, 7.95, 0.81, 5.06, 0.62, 5.84, 1.79}
They were obtained using exactly the same settings presented in Table 9.6,
except that the number of generations was an order of magnitude higher and
the trees were pruned by parsimony pressure (we'll talk more about this in
section 9.4, Pruning Trees with Parsimony Pressure). And as you can see, 25
out of 30 decision trees have the attribute PETAL_LENGTH (“P”) at the
root; four of them have the PETAL_WIDTH (“Q”); and even the
SEPAL_LENGTH (“S”) can be found at the helm of one tree (DT 2). Note,
however, that none of these accurate DTs starts with the SEPAL_WIDTH
(“T”) at the root. Indeed, the decision trees of gene expression programming
have no constraints whatsoever on the kind of attribute placed at the root
(or, of course, at any of the other nodes): they explore all possibilities
with equal freedom and are therefore able to search all the crannies of the
solution space, increasing the odds of finding a perfect solution.
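The parsimony pressure mentioned in passing above rewards smaller trees when accuracy is equal; the actual scheme is the subject of section 9.4, so the formulation below is purely an illustration. The idea is to scale raw fitness by a tiny size-dependent bonus, small enough that accuracy always dominates size:

```python
def parsimony_fitness(raw_fitness, size, s_min, s_max, weight=1 / 5000):
    """Give smaller trees a slight edge over equally fit larger ones.

    size is this tree's node count; s_min and s_max are the smallest and
    largest tree sizes in the current population. The weight keeps the
    bonus tiny so it only breaks ties between equally accurate trees.
    Illustrative formulation only, not the book's exact formula.
    """
    if s_max == s_min:              # all trees the same size: no pressure
        return raw_fitness
    return raw_fitness * (1 + weight * (s_max - size) / (s_max - s_min))
```

With equal raw fitness, a 15-node tree thus edges out a 30-node one, nudging evolution toward compact trees without sacrificing accuracy.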
Also interesting is that, in the same experiment, two decision trees were
created that correctly classify all 150 irises. Their structures are shown
below:
PQQaQcQbTPccbbcaaabba5702028126-[1]
C = {1.5, 5.5, 3.03, 4.87, 4.1, 4.93, 5.63, 0.64, 1.74, 8.26}
PPQaQcQbTSccbbcabcabc8700094491-[2]
C = {1.53, 6.96, 8.5, 1.52, 1.72, 7.61, 0.03, 2.81, 4.96, 3.14}
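Chromosomes like the ones above pack a Karva expression (the letters), a string of constant indices (the digits), and an array C of random constants into one line. As a rough sketch of how such a string can be turned into a working classifier, the Python below decodes a toy chromosome level by level; the symbol arities, the order in which thresholds are assigned, and the "branch left when the value is <= the threshold" rule are assumptions made for illustration, not the book's exact scheme.

```python
# Illustrative sketch only: arities, threshold assignment and the branching
# rule are assumptions, not the book's exact implementation.

ATTRS = {"P": "PETAL_LENGTH", "Q": "PETAL_WIDTH",
         "S": "SEPAL_LENGTH", "T": "SEPAL_WIDTH"}
CLASSES = {"a": "Iris setosa", "b": "Iris versicolor", "c": "Iris virginica"}

def decode(expression, dc_digits, constants):
    """Build a tree (nested dicts) from a Karva string, level by level."""
    nodes = [{"sym": s} for s in expression]
    const_index = iter(dc_digits)
    queue = [nodes[0]]      # the root is always the first symbol
    pos = 1                 # position of the next unread symbol
    while queue:
        node = queue.pop(0)
        if node["sym"] in ATTRS:        # attribute nodes take two branches
            node["threshold"] = constants[int(next(const_index))]
            node["left"], node["right"] = nodes[pos], nodes[pos + 1]
            pos += 2
            queue += [node["left"], node["right"]]
    return nodes[0]

def classify(tree, sample):
    """Walk the tree, branching left when the attribute value is <= threshold."""
    node = tree
    while node["sym"] in ATTRS:
        value = sample[ATTRS[node["sym"]]]
        node = node["left"] if value <= node["threshold"] else node["right"]
    return CLASSES[node["sym"]]

# A toy chromosome: P splits on constants[0], then Q splits on constants[1].
tree = decode("PaQbc", "01", [2.45, 1.75])
print(classify(tree, {"PETAL_LENGTH": 5.1, "PETAL_WIDTH": 1.9}))  # Iris virginica
```

The level-order (breadth-first) reading is what makes Karva expressions attractive: any string of valid symbols decodes into a syntactically valid tree, so genetic operators can modify the string freely.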
As you can see, both these DTs comprise 15 nodes each and both start with
PETAL_LENGTH at the root. It is also worth noting that the first of these
perfect DTs is a descendant of DT 18 above and that the second is a
descendant of DT 28. And by drawing these trees, it becomes clear that, in
both cases, the perfect solution was achieved through a more precise
discrimination between Iris versicolor and Iris virginica. As an
illustration, DT 28 and its fitter descendant are shown in Figure 9.17.
Although we already know that, as a rule, GEP decision trees generalize
extremely well, it is also interesting to see how they generalize with the
iris dataset. For that purpose, the original iris dataset was randomized and
then 100 instances were selected for training, with the remaining 50 used
for testing (these datasets are available for download at the gene
expression programming website). And as expected, the decision trees trained
on this dataset generalize outstandingly well. For instance, the decision
tree below correctly classifies 99 instances on the training set and