Database Reference
In-Depth Information
p
0
=
¯
ε
;
for
l
=1;;
l
++
{
a
1
=argmin
a∈
Σ
E
S
(¯
p
l−
1
a
);
p
l−
1
a
l
;
if
(termination criterion fulfilled)
break
;
choose prefix of ¯
p
l
=¯
¯
p
l
for output;
Fig. 1. General framework of greedy algorithms.
p
approximate generalized median string ¯
symbol by symbol. When we are
l
a
l
(
l ≥
a
1
···a
l−
1
(
ε
going to generate the
-th symbol
1), the substring
for
l
= 1) has already been determined. Then, each symbol from Σ is considered
as a candidate for
a
l
. All the candidates are evaluated and the final decision
of
a
l
is made by selecting the best candidate. The process is continued until
a termination criterion is fulfilled.
A general framework of greedy algorithms is given in Figure 1. There
are several possible choices for the termination criterion and the prefix.
The greedy algorithm proposed in [4] stops the iterative construction pro-
cess when
p
l−
1
is regarded the approximate
generalized median string. Alternatively, the author of [22] suggests the
termination criterion
E
S
(¯
p
l
)
>E
S
(¯
p
l−
1
). Then, ¯
l
=max
p∈S
|p|
. The output is the prefix of ¯
p
l
with
the smallest consensus error relative to
. For both variants a suitable data
structure [4] enables a time complexity of
S
n
2
N|
O
(
Σ
|
) for the Levenshtein
distance. The space complexity amounts to
).
In the general framework in Figure 1 nothing is stated about how to
select
O
(
nN|
Σ
|
a
l
if there exist multiple symbols from Σ with the same value of
E
S
(¯
). Besides a simple random choice, the history of the selection
process can be taken into account to make a more reliable decision [22].
p
l−
1
a
5.2.2.
Genetic Search
Genetic search techniques are motived by concepts from the theory of bio-
logical evolution. They are general-purpose optimization methods that have
been successfully applied to a large number of complex tasks. In [17] two
genetic algorithms are proposed to construct generalized median strings.
The first approach is based on a straightforward representation of strings