It changes the exploration distribution. They showed that neutrality is not generally a disadvantage in terms of performance, although the search space is enlarged.
3.7.2 Estimation of Distribution Algorithms
EDAs do not represent solutions explicitly like EAs, but with probability distributions [106], [86], [81]. The first papers pointing in the direction of EDAs emerged at the beginning of the 1990s. Syswerda [151] introduced a GA that makes use of statistical information gathered from the population. A simple example for bit string representations is to estimate, for each position of the chromosome, a probability, e.g., of being 1. Most continuous EDAs make use of Gaussian probability density functions. From these distributions EDAs sample new solutions, select the best new samples, and estimate the distribution of the next iteration. We distinguish between univariate EDAs with no interactions between the variables, bivariate EDAs with pairwise interactions, and multivariate EDAs with interactions between more than two variables. Well-known instances of EDAs are compact genetic algorithms, univariate marginal distribution algorithms, and population-based incremental learning.
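As an illustration of the univariate case, the following minimal sketch implements a PBIL-style EDA on bit strings for the OneMax problem. The parameter values (string length, population size, selection size, learning rate) are illustrative assumptions, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, pop_size, sel_size, lr = 20, 50, 10, 0.1  # assumed settings
p = np.full(n, 0.5)  # one marginal per position: P(bit = 1)

for generation in range(100):
    # Sample new solutions from the product of Bernoulli marginals.
    pop = (rng.random((pop_size, n)) < p).astype(int)
    fitness = pop.sum(axis=1)  # OneMax: count of ones
    # Select the best samples ...
    best = pop[np.argsort(-fitness)[:sel_size]]
    # ... and estimate the distribution of the next iteration by shifting
    # each marginal toward the frequency of ones among the selected samples.
    p = (1 - lr) * p + lr * best.mean(axis=0)

print("final marginals:", np.round(p, 2))
```

Since every position is modeled by an independent marginal, the sketch assumes no interactions between variables; bivariate and multivariate EDAs replace the product of marginals with richer factorizations.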
In recent years, EDAs for continuous optimization problems have become increasingly popular [19], [122], [105], [136]. They try to fit the Gaussian model to the diversity of the population using the maximum likelihood method. However, it turned out that scaling and adaptation of model parameters are more complicated for continuous domains than for discrete search spaces. In the context of EDAs, the control of the distribution variance is called adaptive variance scaling. Not all step size parameter control techniques known for EAs, see chapter 4, are applicable or have been investigated in the past. Most adaptive variance scaling techniques are based on Rechenberg's 1/5th success rule [114]. Self-adaptation is not efficient for EDAs due to the necessarily high number of competing populations. The derandomized strategy parameter control of the CMA-ES, section 4.1.6, has not been applied to EDAs yet.
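A minimal sketch of a continuous Gaussian EDA with an adaptive variance scaling rule in the spirit of Rechenberg's 1/5th success rule is given below. The scaling factor c, the success threshold 0.2, the update constants, and the sphere test function are illustrative assumptions rather than a specific published variant:

```python
import numpy as np

def sphere(x):
    return np.sum(x ** 2)  # assumed test function, minimum at the origin

rng = np.random.default_rng(1)
dim, pop_size, sel_size = 10, 100, 20  # assumed settings
mean, cov = np.full(dim, 5.0), np.eye(dim)
c = 1.0          # adaptive variance scaling factor
best_f = np.inf

for generation in range(200):
    # Sample from the scaled Gaussian model and evaluate.
    pop = rng.multivariate_normal(mean, c * cov, size=pop_size)
    fitness = np.array([sphere(x) for x in pop])
    order = np.argsort(fitness)
    selected = pop[order[:sel_size]]
    # Maximum likelihood estimate of the Gaussian model from the selected set.
    mean = selected.mean(axis=0)
    cov = np.cov(selected, rowvar=False) + 1e-10 * np.eye(dim)
    # 1/5th-style rule: enlarge the variance if improvements over the best
    # solution so far are frequent, shrink it otherwise, to counteract
    # premature shrinking of the model.
    success_rate = np.mean(fitness < best_f)
    c *= 1.1 if success_rate > 0.2 else 0.9
    best_f = min(best_f, fitness[order[0]])

print("best fitness:", best_f)
```

Without the scaling factor c, the maximum likelihood step alone tends to shrink the covariance from generation to generation, which motivates the premature convergence problem discussed next.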
Continuous EDAs using maximum likelihood estimation suffer from premature convergence [49], [164], [100], [50]. Improperly shaped covariance matrices result in low success rates and prematurely shrinking scaling factors. This phenomenon occurs when the center of the Gaussian model is too far away from the optimum [50] or when the fitness landscape is statistically deceptive on a global scale [164]. Various techniques have been proposed recently to overcome this problem [22], [49], [20].
3.7.3 Self-Adaptation: Evolving the Estimation of Distribution
The EDA approaches inspire an EDA view on self-adaptation. This interpretation takes two steps and is exemplified in the following for the step size self-adaptation of ES. First, the population of an EA may be viewed as a generating finite mixture density function [27], like in the EDA approach. A similar comparison has already been proposed by Rudolph [123] for massively parallel simulated annealing.