also be useful in a simple artificial evolutionary system. And what about
expression in nature? Is all the information encoded in the genome always
expressed? How is it possible to differentiate the information that gets to be
expressed from the information that remains silent? Why is differentiation important?
Might this also be of any use in artificial evolutionary systems? Although
the answers to all these questions are still being sought, what is known
is that, in nature, genomes are vastly redundant, with large amounts of so-called
junk DNA that is never expressed: highly repetitive sequences, introns,
pseudogenes, and so forth. So, most probably, the introduction of junk
sequences in an artificial genome can also be useful.
The genetic representation used in gene expression programming exploits
both the fragmentation of the genome into genes and the existence of junk
sequences or noncoding regions in the genome. As Kimura hypothesized
(Kimura 1983), the accumulation of neutral mutations plays an important
role in evolution, and the noncoding regions of GEP chromosomes are ideal
places for the accumulation of neutral mutations. In this section, we will
analyze the importance of neutral regions in the genome and, consequently,
the importance of neutral mutations in evolution by using the fully
functional genotype/phenotype system of gene expression programming.
For this analysis, two simple, exactly solved test problems were chosen.
These problems can be solved using both unigenic and multigenic systems.
On the one hand, the extent of noncoding regions in unigenic systems can be
easily increased by increasing the gene length. And on the other, in multigenic
systems the number of noncoding regions can be increased by increasing the
number of genes.
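The two ways of adding noncoding material can be illustrated with a minimal sketch. It assumes the usual GEP gene layout, which this section does not restate: a head of length h followed by a tail of length t = h(n_max - 1) + 1, where n_max is the largest function arity. The function set, the example genes, and all helper names below are hypothetical and serve only as an illustration.

```python
# Hypothetical function set: every symbol not listed is a terminal (arity 0).
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2}


def tail_length(head_length, max_arity=2):
    """Tail length t = h * (n_max - 1) + 1 (assumed standard GEP layout)."""
    return head_length * (max_arity - 1) + 1


def gene_length(head_length, max_arity=2):
    """Total gene length: head plus tail."""
    return head_length + tail_length(head_length, max_arity)


def coding_length(gene):
    """Count the leading symbols that are actually used by the expression tree."""
    needed = 1  # symbols still required to complete the tree
    used = 0
    for symbol in gene:
        used += 1
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            break
    return used


def noncoding_length(gene):
    """Symbols at the end of the gene that are never expressed."""
    return len(gene) - coding_length(gene)


# Unigenic system: a longer head gives a longer gene and more room for
# noncoding symbols.
short_gene_length = gene_length(3)   # head 3 + tail 4
long_gene_length = gene_length(10)   # head 10 + tail 11

# Multigenic system: each additional gene brings its own noncoding region.
chromosome = ["+a*bbab", "a+*bbab", "*ab+aba"]
noncoding_per_gene = [noncoding_length(g) for g in chromosome]
```

In this sketch the same total chromosome length (21 symbols) can be reached either with one gene of head length 10 or with three genes of head length 3; the difference is that the multigenic chromosome has one noncoding region per gene.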
The first problem chosen for this analysis is a function finding problem
where the test function (4.1) of section 4.1.1 was used. And the second is a
more difficult sequence induction problem where the test sequence (5.14)
was used (this sequence was also used in sections 12.2 and 12.3).
For the function finding problem, a set of 10 random fitness cases chosen
from the interval [-10, 10] was used (see Table 4.2). The fitness function was
evaluated by equation (3.1b) with a selection range of 25% and a precision of
0.01%, giving maximum fitness f_max = 250. Population sizes P of 30 individuals
and evolutionary times G of 50 generations were used.
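The arithmetic behind the maximum fitness of 250 can be sketched as follows. Equation (3.1b) is not reproduced in this section, so the code assumes a relative-error fitness in which each fitness case contributes the selection range minus the percentage error, with errors at or below the precision counted as zero. The function name, the clamping of negative contributions to zero, and the sample target values are illustrative assumptions, not the values of Table 4.2.

```python
def relative_error_fitness(predictions, targets,
                           selection_range=25.0, precision=0.01):
    """Sum over fitness cases of (selection_range - percentage error).

    Assumptions (not taken from this section): an error at or below
    `precision` counts as zero, and a case whose error exceeds
    `selection_range` contributes nothing rather than a negative value.
    Targets are assumed to be nonzero.
    """
    total = 0.0
    for predicted, target in zip(predictions, targets):
        error = abs((predicted - target) / target) * 100.0
        if error <= precision:
            error = 0.0
        total += max(selection_range - error, 0.0)
    return total


# Ten hypothetical target values standing in for the 10 fitness cases.
targets = [float(t) for t in range(1, 11)]

# A perfect individual reproduces every target exactly.
f_max = relative_error_fitness(targets, targets)

# An individual that overshoots every target by 10% scores lower.
overshoot = [1.1 * t for t in targets]
f_overshoot = relative_error_fitness(overshoot, targets)
```

With 10 fitness cases and a selection range of 25, a perfect individual scores 10 x 25 = 250 under these assumptions, which matches the maximum fitness quoted for the function finding problem.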
For the sequence induction problem, as usual, the first 10 positive integers
n and their corresponding terms a_n were used as fitness cases (see Table 5.5);
the fitness function was also evaluated by equation (3.1b) and a selection
range of 25% and maximum precision (0% error) were chosen, thus giving