Database Reference
In-Depth Information
Fig. 11. If we can afford to be within a certain error range then we don't have to try
all
translations.
We can now prove our main theorem:
Theorem 2.
Given two sequences A and B, with
|A|
=
n
and
|B|
=
m
(suppose
m<n
), and a constant
0
<β<
1
, we can find an approxima-
tion
AS
2
δ,β
(
A, B
)
of the similarity
S
2(
δ, ε, A, B
)
such that
S
2(
δ, ε, A, B
)
−
nδ
2
/β
AS
2
δ,β
(
A, B
)
<β
in
O
(
)
time.
Proof:
Let
a
=
S
2(
δ, ε, A, B
). There exists a translation
f
i
such that
L
fi
is a superset of the matches in the optimal
LCSS
of
A
and
B
. In addition,
by the previous lemma, there are 2
b
translations (
f
i−b
,...,f
i
+
b
)thathave
at most
different matchings from the optimal.
Therefore, if we use the translations
b
,...,
δ
b
in the order-
f
ib
,for
i
=1
ing described above, we are within
b
different matchings from the opti-
mal matching of
A
and
B
(Figure 11). We can find these translations in
O
(
δn
) time if we find and sort all the translations.
Alternatively, we can find these translations in
log
n
(
δ
b
δn
O
)timeifwerun
[
δ
b
] quantile operations.
So we have a total of
δ
b
translations and setting
b
=
βn
completes the
proof.
A, B
n, m
m<n
Given sequences
with lengths
respectively (
), and con-
stants
δ, β, ε
, the approximation algorithm works as follows:
(i)
Find all sets of translations for sequences A and B.