Geoscience Reference
In-Depth Information
being selected. The effect of poorly optimised software settings will most likely be the production of
an inadequate solution - at this point, it is worthwhile remembering the Frog King analogy presented
at the start of the chapter in that evolving a good solution can require a lot of time and effort.
Once an initial population of chromosomes is produced, the next step is an automated assessment
by the software of how well each existing solution can predict the dependent variable. This is done by
applying an evaluation measure or objective function equation which in most cases is used to calcu-
late either absolute error, root mean squared error (RMSE), R -squared, or Pearson's product moment
correlation coefficient (see Dawson et al., 2007) or perhaps a pertinent relative error metric (Ferreira,
2006b). For more information on these statistics, see Abrahart et al. (2011) or Bellocchi et al. (2010).
A further measure called fitness is similarly determined. This fitness metric is a dimensionless output
calculated using a separate equation, which is usually displayed as a summary of the best individual
in a given generation to assist user evaluation of the progress of solution development during a GP run.
Following assessment, before any intergenerational adaptations to chromosomes are made, a cohort
is selected from the initial (and, thereafter, from a subsequent) population of potential solutions. The
most common selection methods include tournament selection, fitness proportionate selection and rou-
lette wheel selection with elitism (Ferreira, 2001; Poli et al., 2008). In tournament selection, a cohort
is randomly selected. Individuals are thereafter put into pairs and the fitness of each member com-
puted and compared. Each individual possessing a higher fitness is thereafter identified and adapted.
Roulette wheel selection involves ranking the fitness of individuals and, subsequently, using a roulette
wheel approach to sampling and selection, in which a greater proportion of the wheel is allocated to
the fittest individuals. This means that chromosomes with a higher fitness will have a greater chance
of being selected, in contrast to ones with a lower fitness. In GEP, clones are also made of the fittest
individuals, to ensure that good traits are not lost as a result of the selection process (Ferreira, 2001). In
some cases, intergenerational domination by a small number of individuals can manifest itself, mean-
ing the search process can get trapped in a local optimum. Most popular software packages, however,
have introduced specific ways and means for addressing such a problem. By performing a succession
of runs, it is also possible to exploit the stochastic nature of GP and so help avoid such situations.
Adaptations made to the selected chromosomes are controlled probabilistically using special
functions called genetic operators. These operators behave in a similar way to the modifications
made to the DNA of biological organisms during breeding. The main genetic operators are cross-
over, mutation, inversion and transposition - with the dominant types being crossover and mutation.
In each case, part of the chromosome is selected and altered by some kind of cutting and replac-
ing routine. Simple schematic diagrams of these operations are shown in Figure 8.4. In crossover
(called recombination in GEP), two chromosomes are selected as the parents for two offspring.
Then one or more randomly selected fragments of each chromosome are selected, cut and replaced
by their respective counterparts. In mutation, randomly selected fragments of individual chromo-
somes are selected and substituted with randomly evolved replacements. Fragments can comprise
individual mathematical functions, constants or variables, or indeed a larger composition of any
such components. In inversion, components residing inside a randomly selected fragment, consist-
ing of mathematical functions, constants and/or variables, are simply swapped around - sometimes
having a neutral effect depending on the functions involved. Transposition involves selecting one
part of a chromosome and, thereafter, moving that component to a different position in the same
chromosome. Note, however, that duplication is permitted, such that the original component might
remain intact, and that as a result of insertion, other parts of a potential solution could end up being
obliterated and lost for good. This process is implemented in GEP, as a mechanism that is used for
identifying an inactive part of a chromosome, which is subsequently moved to an active position.
The operators described here outline some of the concepts which underpin evolutionary and gen-
erational changes made to chromosomes in GP. Detailed technical descriptions of these and other
operators can be found in relevant core texts (Ferreira, 2006; Goldberg, 1989; Koza, 1990).
The cycle of selection and optimisation typically continues for 1000+ generations until a desired
fitness is reached or the predefined stopping point is satisfied. The stopping point is a user-defined
Search WWH ::




Custom Search