Note that three of the terms of this model match the target function (4.6)
exactly. The remaining term, the one in c, was discovered by GEP as a
combination of cos and sin that is an excellent approximation to the
corresponding term of the target function. In the testing set, this model
has a fitness of 999.9907 and an R-square of 0.99998534, which shows that
it generalizes extremely well and is an almost perfect match to the target
function (4.6). As we will see again and again in this topic, gene
expression programming can be used to find very good solutions to problems
of great complexity.
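As a point of reference for the R-square figure quoted above, the sketch below computes R-square as the squared Pearson correlation between target and predicted values. This definition is an assumption (the text does not say which one is used), the function name r_square is illustrative, and the numbers in the usage lines are made up.

```python
def r_square(targets, predictions):
    """Squared Pearson correlation between two equally long sequences.

    Assumption: R-square is taken to be the square of the correlation
    coefficient; other definitions exist and may give different values.
    """
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return (cov * cov) / (var_t * var_p)


if __name__ == "__main__":
    # Illustrative values only, not data from the experiment.
    y_true = [1.0, 2.0, 3.0, 4.0]
    y_model = [1.1, 1.9, 3.2, 3.9]
    print(r_square(y_true, y_model))
```

A value close to 1 means the model output is almost perfectly linearly related to the target, which is how the figure reported for the testing set should be read under this definition.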
4.1.3 Mining Meaningful Information from Noisy Data
Tools for mining knowledge from huge databases are crucial in a world
where the amount of data is constantly increasing. There is now so much
data that finding the meaningful factors in it has become a Herculean
task, and new technologies have been developed to extract relevant
knowledge from these huge databases. Gene expression programming is one of
these emerging technologies and is ideal for separating the wheat from the
chaff. In this section we illustrate how this can be achieved with a
function finding problem in which nine out of ten variables are
meaningless.
The test function used in this experiment is the already familiar function
of section 4.1.1, with the difference that the meaningful parameter must
be discovered among a total of 10 variables. Table 4.4 shows both the
performance and the parameters used per run in this experiment. As the
high success rate (77%) shows, gene expression programming was not
overwhelmed by the quantity of irrelevant data and found its way around it
very efficiently.
The first perfect solution was created in generation 61 of run 0. Its
chromosome is shown below (the sub-ETs are linked by addition):
0123456789012
*a*aa-hgadadc
-ah*d-gcfjcbd
/--gcgciijeeg
h+eeehbeddbfd
*aadaabcecfgb
(4.8a)
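The chromosome (4.8a) can be read mechanically. The sketch below decodes each gene breadth-first into its sub-ET and adds the five sub-ETs together. It assumes that +, -, * and / each take two arguments and that every letter is a terminal; the names OPS, GENES, decode, to_infix, evaluate and chromosome_value are illustrative and do not come from the text.

```python
import operator
from collections import deque

# Assumption: the function set is the four binary arithmetic operators.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

# The five genes of chromosome (4.8a); the row of digits above them in the
# listing is only a position index and is not part of the chromosome.
GENES = [
    "*a*aa-hgadadc",
    "-ah*d-gcfjcbd",
    "/--gcgciijeeg",
    "h+eeehbeddbfd",
    "*aadaabcecfgb",
]


def decode(gene):
    """Build the sub-ET of a gene by reading its symbols breadth-first.

    Each node is a list [symbol, children]. Symbols past the end of the
    expressed region are simply never attached to the tree.
    """
    nodes = [[sym, []] for sym in gene]
    queue = deque([nodes[0]])
    next_pos = 1
    while queue:
        node = queue.popleft()
        arity = 2 if node[0] in OPS else 0
        for _ in range(arity):
            child = nodes[next_pos]
            next_pos += 1
            node[1].append(child)
            queue.append(child)
    return nodes[0]


def to_infix(node):
    """Render a sub-ET as a fully parenthesized infix string."""
    sym, kids = node
    if not kids:
        return sym
    return "(" + to_infix(kids[0]) + sym + to_infix(kids[1]) + ")"


def evaluate(node, env):
    """Evaluate a sub-ET given a dict mapping terminal names to values."""
    sym, kids = node
    if not kids:
        return env[sym]
    return OPS[sym](evaluate(kids[0], env), evaluate(kids[1], env))


def chromosome_value(env):
    """Link the sub-ETs by addition, as stated for (4.8a)."""
    return sum(evaluate(decode(g), env) for g in GENES)


if __name__ == "__main__":
    for g in GENES:
        print(g, "->", to_infix(decode(g)))
```

Under these assumptions the five sub-ETs read (a*(a*a)), (a-h), ((g-c)/(g-c)), h and (a*a). In their sum h cancels and the third term equals 1 wherever g differs from c, so the result depends on a alone, even though the genes mention several of the other variables.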