Evolutionary Studies - Gene Expression Programming

also be useful in a simple artificial evolutionary system. And what about expression in nature? Is all the information encoded in the genome always expressed? How is it possible to differentiate the information that gets to be expressed from the information that remains silent? Why is differentiation important? Might this also be of any use in artificial evolutionary systems? Although the answers to all these questions are still being sought, what is known is that, in nature, genomes are vastly redundant, with lots and lots of so-called junk DNA which is never expressed: highly repetitive sequences, introns, pseudogenes, and so forth. So, most probably, the introduction of junk sequences in an artificial genome can also be useful.

The genetic representation used in gene expression programming exploits both the fragmentation of the genome into genes and the existence of junk sequences or noncoding regions in the genome. As Kimura hypothesized (Kimura 1983), the accumulation of neutral mutations plays an important role in evolution, and the noncoding regions of GEP chromosomes are ideal places for the accumulation of neutral mutations. In this section, we will analyze the importance of neutral regions in the genome and, consequently, the importance of neutral mutations in evolution by using the fully functional genotype/phenotype system of gene expression programming.

For this analysis, two simple, exactly solved test problems were chosen. These problems can be solved using both unigenic and multigenic systems. On the one hand, the extent of noncoding regions in unigenic systems can be easily increased by increasing the gene length. On the other hand, in multigenic systems the number of noncoding regions can be increased by increasing the number of genes.
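The relation between gene length and noncoding regions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system used in the analysis: it assumes the usual GEP gene layout (a head of length h followed by a tail of length t = h(n - 1) + 1, where n is the largest function arity), and the function set in `ARITY`, the helper names, and the two example genes are hypothetical.

```python
# Hypothetical function set: arity of each function symbol.
# Any symbol not listed here is treated as a terminal (arity 0).
ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "Q": 1}


def tail_length(head: int, max_arity: int = 2) -> int:
    """Tail length t = h(n - 1) + 1 for head length h and maximum arity n."""
    return head * (max_arity - 1) + 1


def coding_length(gene: str) -> int:
    """Number of leading symbols that are expressed (the coding region)."""
    needed = 1  # symbols still required to complete the expression tree
    for position, symbol in enumerate(gene, start=1):
        needed += ARITY.get(symbol, 0) - 1
        if needed == 0:
            return position
    raise ValueError("gene ends before the expression is complete")


def noncoding_lengths(chromosome: str, gene_length: int) -> list:
    """Length of the noncoding region of every gene in a chromosome."""
    genes = [chromosome[i:i + gene_length]
             for i in range(0, len(chromosome), gene_length)]
    return [len(gene) - coding_length(gene) for gene in genes]


# Two hypothetical unigenic chromosomes encoding the same expression (a*b)+a.
short_gene = "+*ab" + "aabba"          # h = 4, t = 5, length 9
long_gene = "+*abaab+" + "aabbaabba"   # h = 8, t = 9, length 17
```

In this sketch both genes express the same five symbols, so lengthening the gene from 9 to 17 symbols raises the noncoding region from 4 to 12 symbols. Passing a chromosome made of three copies of `short_gene` to `noncoding_lengths` with `gene_length=9` yields one noncoding region per gene, which mirrors the multigenic case.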

The first problem chosen for this analysis is a function finding problem where the test function (4.1) of section 4.1.1 was used. And the second is a more difficult sequence induction problem where the test sequence (5.14) was used (this sequence was also used in sections 12.2 and 12.3).

For the function finding problem, a set of 10 random fitness cases chosen from the interval [-10, 10] was used (see Table 4.2); the fitness function was evaluated by equation (3.1b) and a selection range of 25% and a precision of 0.01% were chosen, giving maximum fitness f_max = 250; and population sizes P of 30 individuals and evolutionary times G of 50 generations were chosen.
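The value f_max = 250 follows from 10 fitness cases each contributing at most the selection range of 25. The Python sketch below makes that arithmetic concrete. It rests on several assumptions that are not stated in this section: that equation (3.1b) is a relative-error fitness in which each case contributes the selection range minus the percentage error, that errors within the precision count as zero, and that negative contributions are clamped to zero. The target function is a hypothetical placeholder, not test function (4.1), and the random cases are not those of Table 4.2.

```python
import random


def relative_error_fitness(predict, cases, selection_range=25.0, precision=0.01):
    """Sum over fitness cases of (selection range - relative error in percent).

    Assumed form only; errors within `precision` are treated as zero and
    negative contributions are clamped to zero (both are assumptions).
    Target values must be nonzero, since the error is relative to the target.
    """
    total = 0.0
    for x, target in cases:
        error = abs((predict(x) - target) / target) * 100.0  # percent error
        if error <= precision:
            error = 0.0
        total += max(selection_range - error, 0.0)
    return total


def placeholder_target(a):
    """Hypothetical target function, never zero on [-10, 10]."""
    return a * a + 1.0


# 10 random fitness cases drawn from [-10, 10], with a fixed seed.
rng = random.Random(0)
cases = [(x, placeholder_target(x))
         for x in (rng.uniform(-10.0, 10.0) for _ in range(10))]

# Maximum fitness: number of fitness cases times the selection range.
f_max = len(cases) * 25.0
```

Under these assumptions a perfect model, such as `placeholder_target` itself, scores exactly `f_max`, while a model that is off by more than 25% on every case scores zero.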

For the sequence induction problem, as usual, the first 10 positive integers n and their corresponding terms a_n were used as fitness cases (see Table 5.5); the fitness function was also evaluated by equation (3.3b) and a selection range of 25% and maximum precision (0% error) were chosen, thus giving
