Database Reference
In-Depth Information
unseen candidate is
L
(
λ
i
)
∈L
a
W
(
λ
i
)
max(min
L
(
λ
i
)
∈L
a
f
=
N
1
)
.
f
i
1
,
v
f
<θ
is a sucient stopping condition that leads to no
false dismissals.
Hence,
N
f
exists. This can be
achieved by examining the
L
1
-norm of all frontier elements simultane-
ously and is based on the observation that
It is easy to see that a tighter bound for
N
Observation 6.1.
Given a string
s
with
L
1
-norm
s
1
and the
L
1
-norms of all frontier elements
f
i
1
, we can immediately deduce
whether
s
potentially appears in list
L
(
λ
i
) or not by a simple com-
parison: If
f
i
1
and
s
has not been encountered in list
L
(
λ
i
)
yet, then
s
does not appear in
L
(
λ
i
) (given that lists are sorted in
increasing order of
L
1
-norms).
s
1
<
Let
L
(
λ
j
) be a list s.t.
f
j
1
= min
L
(
λ
i
)
∈L
a
f
i
1
. Based on Obser-
vation 6.1, a conceptual string
s
with
L
1
-norm
1
that has
not been encountered yet, can appear only in list
L
(
λ
j
). Thus
s
1
=
f
j
Lemma 6.3.
Let
L
a
⊆
L
v
be the set of active lists. The terminating
condition
L
(
λ
j
)
∈
L
a
:
f
j
1
≤
f
i
1
W
(
λ
j
)
max(
f
=
N
max
L
(
λ
i
)
∈L
a
<θ,
f
i
1
,
v
1
)
does not lead to any false dismissals.
Proof.
The proof is based on the proof of Lemma 6.2 and a simple
enumeration argument with at most
m
possible cases. Without loss of
generality, let
L
a
=
{L
(
λ
1
)
,...,L
(
λ
v
m
)
}
be the set of active lists and
f
<θ
is true, we have the following
f
1
1
< ... <
f
m
1
. Then, if
N
possible cases:
1. An unseen string
s
appears only in
L
(
λ
1
). By construction
s
≥
f
1
⇒N
(
s, v
)
≤N
(
f
1
,v
)
⇒N
(
s, v
)
<θ
.
1
1