Database Reference
In-Depth Information
Table 8.1. Active nodes for each prefix of query
“Found” on the trie of Figure 5.1.
F
Fo
Fou
Foun
Found
U, 1
U, 1
Fu, 1
Fu, 1
Fun, 1
Fund, 1
F,1
F,0
Fo,0
Fo,1
C, 1
C, 1
Foo, 1
Foo, 1
Fu, 1
Fe, 1
Fo, 1
Fe, 1
(hence active node
) = 0).
We complete set A 0 by adding all nodes corresponding to strings with
lengths in the interval [1 ]. These strings correspond to performing
from1upto θ arbitrary insertions to the empty string, and are still
within edit distance θ from v 0 .
The key observation now is that active set A i can be computed
directly from set A i− 1 . The recurrence relation can be derived from
the recurrence relation of the dynamic programming algorithm for edit
distance verification (Algorithm 2.1.1). The intuition is that we are
trying to incrementally compute the dynamic programming matrix of
Algorithm 2.1.1 for all strings contained in the trie at once (by virtue
of computing the edit distance between the query string and all active
paths of the trie).
Given query prefix v i , 1 ≤ i ≤|v| , we consider all strings with lengths
within the interval [ i
is associated with edit distance
E
(
,
θ,i + θ ] (i.e., the L 0 -norm filter bounds for
v i ). A 0 corresponds to the first row of the dynamic programming
matrix. Every time we increase i we need to compute a new row in
the matrix. The new row is computed based on the values in the previ-
ous row. Assume that we are computing A i from A i− 1 for prefix v i . The
algorithm proceeds by examining every entry in set A i− 1 and taking
three actions:
(1) It determines whether an active node N should be added in
set A i by increasing its edit distance by one. This corresponds
to a deletion of character σ i (and value d 1 in Algorithm 2.1.1).
(2) It tests whether there exists a child node N of N , s.t.
σ ( N )= σ i . In that case N is added in A i with edit
 
Search WWH ::




Custom Search