Database Reference
In-Depth Information
in terms of chromosomes by one-dimensional arrays of varying lengths con-
sisting of symbols from Σ. The string “median”, for instance, is coded as
chromosome
m
e
d
i
a
n
) each, the fitness func-
tion is simply defined by the consensus error of a chromosome string. The
roulette wheel selection is used to create offspring. For crossover the single-
point operator is applied. With a mutation probability either deletion,
insertion or substitution is performed at each position of a chromosome.
The input strings are used as part of the initial population. The rest of
the initial population is filled by randomly selecting some input strings and
applying the mutation operator on them. The population evolution process
terminates if one of the following two criteria is fulfilled. The first crite-
rion is that the maximal number of generations has been reached. Second,
if the population becomes very homogeneous, the process terminates as well.
The homogeneity is measured by means of the average and the variance of
the fitness values of a population. If their product is smaller than a pre-
specified threshold, the population is considered enough homogeneous to
stop the evolution. Let
Given a set
S
of
N
input strings of length
O
(
n
the replacement
percentage. When using the Levenshtein edit distance, the time complexity
of this genetic algorithm amounts to
P
denote the population size and
p
) per generation.
The genetic algorithm above requires quite a large number of string dis-
tance computations due to the fitness function evaluation in each iteration.
For better computational eciency a second genetic algorithm is designed
using a more elaborate chromosome coding scheme of strings; see [17] for
details. The time complexity becomes
O
(
Nn 2 pP
O
(
NnmpP
),
m<<n
, implying a
substantial speedup.
5.2.3. Perturbation-Based Iterative Refinement
The set median represents an approximation of the generalized median
string. The greedy algorithms and the genetic search techniques also give
approximate solutions. An approximate solution ¯
can be further improved
by an iterative process of systematic perturbations. This idea was first
suggested in [20]. But no algorithmic details are specified there. A concrete
algorithm for realizing systematic perturbations is given in [26]. For each
p
 
Search WWH ::




Custom Search