Database Reference
In-Depth Information
Table 1. Characteristic of median computation algorithms.
Algorithm
String distance
(original paper)
Extension
to arbitrary
distance
Handling
weighted
median
Handling
center
median
Exact algorithm and its
variants [21,24,25]
Levenshtein
No
Yes
No
Greedy algorithms
[4,22]
Levenshtein
Yes
Yes
Yes
Genetic algorithm [17]
Levenshtein
Yes
Yes
Yes
Dynamic algorithm [18]
Levenshtein
No
Yes
No
perturbation-based
iterative refinement
[20,26]
Arbitrary
distance
N/A
Yes
Yes
The majority of the algorithms described in this paper are based on
the Levenshtein edit distance. The algorithms' applicability to an arbitrary
string distance function is summarized in Table 1. Note that an extension
to an arbitrary string distance function usually means a computational
complexity different from that for the Levenshtein edit distance.
In the definition of median string, all the input strings have a uniform
weight of one. If necessary, this basic definition can be easily extended to
weighted median string to model the situation where each string has an
individual importance, confidence, etc. Given the weights
w q ,q ∈ S
,the
weighted generalized median string is simply
p
¯
=arg min
p∈U
w q · d
(
p, q
)
.
q = S
All the computational procedures discussed before can be modified to
handle this extension in a straightforward manner.
The generalized median string represents one way of capturing the essen-
tial characteristics of a set of strings. There do exist other possibilities. One
example is the so-called center string [15] defined by:
p =arg min
p∈U
max
q∈S
d
(
p, q
)
.
It is important to note that the same term is used in [13] to denote the
set median string. Under the two conditions given in Section 3, it is proved
in [15] that computing the center string is NP-hard. Another result is given
in [7] where the NP-hardness of the center string problem is proved for
the special case of a binary alphabet (i.e., Σ =
) and the Hamming
string distance. The ability of the algorithms to compute the center string
is summarized in Table 1.
{
0
,
1
}
Search WWH ::




Custom Search