It changes the exploration distribution. They showed that neutrality is not generally a disadvantage in terms of performance, although the search space is enlarged.
3.7.2 Estimation of Distribution Algorithms
EDAs do not represent solutions explicitly like EAs, but with probability distributions [106], [86], [81]. The first papers pointing in the direction of EDAs emerged at the beginning of the 1990s. Syswerda [151] introduced a GA that makes use of statistical information gathered from the population. A simple example for bit string representations is to estimate, for each position of the chromosome, a probability, e.g., of being 1. Most continuous EDAs make use of Gaussian probability density functions. From these distributions EDAs sample new solutions, select the best new samples, and estimate the distribution of the next iteration. We distinguish between univariate EDAs with no interactions between the variables, bivariate EDAs with pairwise interactions, and multivariate EDAs with interactions between more than two variables. Well-known instances of EDAs are compact genetic algorithms, univariate marginal distribution algorithms, and population-based incremental learning.
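As an illustration of the univariate case, the following minimal sketch implements a PBIL-style EDA on bit strings for the OneMax problem. The parameter values (string length, population size, selection size, learning rate) are illustrative assumptions, not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
n, pop_size, sel_size, lr = 20, 50, 10, 0.1  # assumed settings
p = np.full(n, 0.5)  # one marginal per position: P(bit = 1)

for generation in range(100):
    # Sample new solutions from the product of Bernoulli marginals.
    pop = (rng.random((pop_size, n)) < p).astype(int)
    fitness = pop.sum(axis=1)  # OneMax: count of ones
    # Select the best samples ...
    best = pop[np.argsort(-fitness)[:sel_size]]
    # ... and estimate the distribution of the next iteration by shifting
    # each marginal toward the frequency of ones among the selected samples.
    p = (1 - lr) * p + lr * best.mean(axis=0)

print("final marginals:", np.round(p, 2))
```

Since every position is modeled by an independent marginal, the sketch assumes no interactions between variables; bivariate and multivariate EDAs replace the product of marginals with richer factorizations.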
In recent years, EDAs for continuous optimization problems have become increasingly popular [19], [122], [105], [136]. They try to fit the Gaussian model to the diversity of the population using the maximum likelihood method. However, it turned out that scaling and adaptation of model parameters are more complicated for continuous domains than for discrete search spaces. In the context of EDAs, the control of the distribution variance is called adaptive variance scaling. Not all step size parameter control techniques known for EAs, see chapter 4, are applicable or have been investigated in the past. Most adaptive variance scaling techniques are based on Rechenberg's 1/5th success rule [114]. Self-adaptation is not efficient for EDAs due to the necessarily high number of competing populations. The derandomized strategy parameter control of the CMA-ES, section 4.1.6, has not been applied to EDAs yet.
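A minimal sketch of a continuous Gaussian EDA with an adaptive variance scaling rule in the spirit of Rechenberg's 1/5th success rule is given below. The scaling factor c, the success threshold 0.2, the update constants, and the sphere test function are illustrative assumptions rather than a specific published variant:

```python
import numpy as np

def sphere(x):
    return np.sum(x ** 2)  # assumed test function, minimum at the origin

rng = np.random.default_rng(1)
dim, pop_size, sel_size = 10, 100, 20  # assumed settings
mean, cov = np.full(dim, 5.0), np.eye(dim)
c = 1.0          # adaptive variance scaling factor
best_f = np.inf

for generation in range(200):
    # Sample from the scaled Gaussian model and evaluate.
    pop = rng.multivariate_normal(mean, c * cov, size=pop_size)
    fitness = np.array([sphere(x) for x in pop])
    order = np.argsort(fitness)
    selected = pop[order[:sel_size]]
    # Maximum likelihood estimate of the Gaussian model from the selected set.
    mean = selected.mean(axis=0)
    cov = np.cov(selected, rowvar=False) + 1e-10 * np.eye(dim)
    # 1/5th-style rule: enlarge the variance if improvements over the best
    # solution so far are frequent, shrink it otherwise, to counteract
    # premature shrinking of the model.
    success_rate = np.mean(fitness < best_f)
    c *= 1.1 if success_rate > 0.2 else 0.9
    best_f = min(best_f, fitness[order[0]])

print("best fitness:", best_f)
```

Without the scaling factor c, the maximum likelihood step alone tends to shrink the covariance from generation to generation, which motivates the premature convergence problem discussed next.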
Continuous EDAs using maximum likelihood estimation suffer from premature convergence [49], [164], [100], [50]. Improperly shaped covariance matrices result in low success rates and prematurely shrinking scaling factors. This phenomenon occurs when the center of the Gaussian model is too far away from the optimum [50] or when the fitness landscape is statistically deceptive on a global scale [164]. Various techniques have been proposed recently to overcome this problem [22], [49], [20].
3.7.3 Self-Adaptation: Evolving the Estimation of Distribution
The EDA approaches inspire an EDA view on self-adaptation. This interpretation takes two steps and is exemplified in the following for the step size self-adaptation of ES. First, the population of an EA may be viewed as a generating finite mixture density function [27], like in the EDA approach. A similar comparison has already been proposed by Rudolph [123] for massively parallel simulated annealing.