Database Reference
In-Depth Information
Algorithm 8.3.3:
RangeQuery(
v,N,θ,s
l
,s
u
)
if
N
is the leaf node
for each
s
∈
N
if
Edit Distance Verification
(
v, s, θ
)
do
do
Include
s
in query result
else
if
LB
(
v,
[
s
l
,s
1
])
≤ θ
do
RangeQuery
(
v,N
1
,θ,s
l
,s
1
)
for
j
=2
to
m
do
if
LB
(
v,
[
s
j−
1
,s
j
])
do
≤
θ
do
RangeQuery
(
v,N
j
,θ,s
j−
1
,s
j
)
if
LB
(
v,
[
s
m
,s
u
])
≤ θ
do
RangeQuery
(
v,N
m
+1
,s
m
,s
u
)
string
v
if the lower bounding property holds. The eciency of the algo-
rithm depends on the tightness of the lower bound. We discuss concrete
string orders that satisfy Property 8.2 in the next section.
If a string order
φ
supports range queries, it also directly supports
top-
k
selection queries on the B-tree structure. We simply use a min-
heap to keep the current top-
k
similar strings and update the threshold
θ
with the distance value of the top element in the heap. The detailed
algorithm is shown in Algorithm 8.3.4.
The standard all-match join algorithm on B-trees for traditional
one-dimensional data discovers all node pairs
on the same
level of the tree, with non-empty value overlap on their ranges.
Expansions are conducted by adding join candidates after testing every
pair of children drawn from
N
1
and
N
2
, respectively. For string join
queries with threshold
θ
the standard algorithm is applicable if the
following property holds:
{
N
1
,N
2
}
Property 8.3(Pairwise Lower Bounding).
Given two string inter-
vals [
s
l
,s
u
] and [
r
l
,r
u
], the string order
φ
is pairwise lower bounding if
it returns the lower bound on the distance between any
s
[
s
l
,s
u
] and
∈
[
r
l
,r
u
].
any
r
∈