Database Reference
In-Depth Information
and Euclidean distance) results when
T
=
∅
and
L
p
=
p
l
i
=1
|x
i
− y
i
|
p
d
0
(
xy
)=
(6)
where
.
Editing distance (or Levenshtein distance) is the weight of the mini-
mum sequence of editing operations needed to transform one sequence into
another [Sankoff and Kruskal (1999)]. It is usually defined on strings (or
equi-spaced time sequences), but Mannila and Ronkainen (1997) show how
to generalise this measure to general (non equi-spaced) time sequences. In
the framework given above, editing distance may be defined as:
|x|
=
|y|
=
l
c
(del(
x
1
)) +
d
ed
(
x
2:
m
,y
)
d
ed
(
x, y
)=min
c
(del(
y
1
)) +
d
ed
(
x, y
2:
n
)
(7)
c
(sub(
x
1
y
1
)) +
d
ed
(
x
2:
m
,y
2:
n
)
where
m
=
|x|
,
n
=
|y|
,del(
x
1
) and del(
y
1
) stand for deleting the first
elements of
x
and
y
, respectively, and sub (
x
1
,
y
1
) stands for substituting
the first element of
.
A distance function with time warping allows non-uniform scaling along
the time axis, or, in sequence terms,
stuttering
. Stuttering occurs when
an element from one of the sequences is repeated several times. A typical
distance measure is:
x
with the first element of
y
d
tw
(
x, y
2:
n
)
(
x
- stutter)
,
d
tw
(
x, y
)=
d
0
(
x
1
,y
1
)+min
d
tw
(
x
2:
m
,y
)(
y
- stutter)
,
(8)
d
tw
(
x
2:
m
,y
2:
n
) (no stutter)
.
(
mn
)) using
dynamic programming [Cormen
et al
. (1993), Sankoff and Kruskal (1999)]:
An
Both
d
ed
and
d
tw
can be computed in quadratic time (
O
m × n
D
D
i, j
d
x
1:
i
,y
1:
j
). The final
table
is filled iteratively so that
[
]=
(
distance
].
The
Longest Common Subsequence
(LCS) measure [Cormen
et al.
(1993)],
d
(
x
,
y
), is found in
D
[
m, n
d
lcs
(
x, y
), is the length of the longest sequence
s
which is a
(possibly non-contiguous) subsequence of both
x
and
y
, in other words:
)=max
|s |
s ⊆ x, s ⊆ y
.
d
lcs
(
x, y
(9)
In some applications the measure is normalised by dividing by
max(
)maybecalcu-
lated using dynamic programming, in a manner quite similar to
|x|, |y|
), giving a distance in the range [0, 1].
d
lcs
(
x
,
y
d
ed
.