Genetic Programming - GeoComputation

Geoscience Reference

In-Depth Information

being selected. The effect of poorly optimised software settings will most likely be the production of

an inadequate solution - at this point, it is worthwhile remembering the Frog King analogy presented

at the start of the chapter in that evolving a good solution can require a lot of time and effort.

Once an initial population of chromosomes is produced, the next step is an automated assessment

by the software of how well each existing solution can predict the dependent variable. This is done by

applying an evaluation measure or objective function equation which in most cases is used to calcu-

late either absolute error, root mean squared error (RMSE), R -squared, or Pearson's product moment

correlation coefficient (see Dawson et al., 2007) or perhaps a pertinent relative error metric (Ferreira,

2006b). For more information on these statistics, see Abrahart et al. (2011) or Bellocchi et al. (2010).

A further measure called fitness is similarly determined. This fitness metric is a dimensionless output

calculated using a separate equation, which is usually displayed as a summary of the best individual

in a given generation to assist user evaluation of the progress of solution development during a GP run.

Following assessment, before any intergenerational adaptations to chromosomes are made, a cohort

is selected from the initial (and, thereafter, from a subsequent) population of potential solutions. The

most common selection methods include tournament selection, fitness proportionate selection and rou-

lette wheel selection with elitism (Ferreira, 2001; Poli et al., 2008). In tournament selection, a cohort

is randomly selected. Individuals are thereafter put into pairs and the fitness of each member com-

puted and compared. Each individual possessing a higher fitness is thereafter identified and adapted.

Roulette wheel selection involves ranking the fitness of individuals and, subsequently, using a roulette

wheel approach to sampling and selection, in which a greater proportion of the wheel is allocated to

the fittest individuals. This means that chromosomes with a higher fitness will have a greater chance

of being selected, in contrast to ones with a lower fitness. In GEP, clones are also made of the fittest

individuals, to ensure that good traits are not lost as a result of the selection process (Ferreira, 2001). In

some cases, intergenerational domination by a small number of individuals can manifest itself, mean-

ing the search process can get trapped in a local optimum. Most popular software packages, however,

have introduced specific ways and means for addressing such a problem. By performing a succession

of runs, it is also possible to exploit the stochastic nature of GP and so help avoid such situations.

Adaptations made to the selected chromosomes are controlled probabilistically using special

functions called genetic operators. These operators behave in a similar way to the modifications

made to the DNA of biological organisms during breeding. The main genetic operators are cross-

over, mutation, inversion and transposition - with the dominant types being crossover and mutation.

In each case, part of the chromosome is selected and altered by some kind of cutting and replac-

ing routine. Simple schematic diagrams of these operations are shown in Figure 8.4. In crossover

(called recombination in GEP), two chromosomes are selected as the parents for two offspring.

Then one or more randomly selected fragments of each chromosome are selected, cut and replaced

by their respective counterparts. In mutation, randomly selected fragments of individual chromo-

somes are selected and substituted with randomly evolved replacements. Fragments can comprise

individual mathematical functions, constants or variables, or indeed a larger composition of any

such components. In inversion, components residing inside a randomly selected fragment, consist-

ing of mathematical functions, constants and/or variables, are simply swapped around - sometimes

having a neutral effect depending on the functions involved. Transposition involves selecting one

part of a chromosome and, thereafter, moving that component to a different position in the same

chromosome. Note, however, that duplication is permitted, such that the original component might

remain intact, and that as a result of insertion, other parts of a potential solution could end up being

obliterated and lost for good. This process is implemented in GEP, as a mechanism that is used for

identifying an inactive part of a chromosome, which is subsequently moved to an active position.

The operators described here outline some of the concepts which underpin evolutionary and gen-

erational changes made to chromosomes in GP. Detailed technical descriptions of these and other

operators can be found in relevant core texts (Ferreira, 2006; Goldberg, 1989; Koza, 1990).

The cycle of selection and optimisation typically continues for 1000+ generations until a desired

fitness is reached or the predefined stopping point is satisfied. The stopping point is a user-defined

GeoComputation

Search WWH ::

Custom Search

Home