Database Reference
In-Depth Information
described in Algorithm 7.1. Reuse denotes the algorithm with the reusing technique
and Reuse+pruning denotes the algorithm with both the reusing technique and the
pruning technique.
Figures 7.12(a) to (d) show the runtime of the query answering methods for
count queries in various parameter settings. Exact-count is the exact algorithm.
Equi-width and Equi-depth denotes the algorithms using the approximation tech-
niques discussed in Sections 7.6.2.1 and 7.6.2.2, respectively. By default,
10 4
ε =
ρ =
and
30.
In Figure 7.12(a), we fix the number of tuples to 5
000 and the expected degree
of each tuple is 4. The runtime of all three algorithms increase mildly as the num-
ber of linkage increases from 20
,
,
000 to 100
,
000, thanks to the vertex compression
technique discussed in Section 7.3.3.
In Figure 7.12(b), we fix the expected degree of each tuple to 4 and increase
the number of tuples in the data set. The runtime of the exact-count algorithm in-
creases quadratically. The runtime of the two approximation algorithms increases
very slightly.
Figure 7.12(c) shows the increase of runtime with respect to the degrees of tuples.
The larger the degree, the more times of convolution are performed in the query
evaluation. Therefore, the runtime of all three algorithms increases linearly with
respect to the expected degree.
Last, Figure 7.12(d) illustrates the change of runtime when the number of con-
nected components increases.
Figures 7.13(a) to (d) show the runtime of the sum query evaluation in the same
setting as in Figure 7.12. The results on the sum query evaluation demonstrate the
similar patterns as in the count query evaluation. But the overall cost for the sum
query evaluation is higher than that of the count query evaluation, since it depends
on the number of values appearing in the result distribution.
Figures 7.14(a) to (d) show the efficiency of the min query evaluation. Exact-
min denotes the algorithm that transforms the min query to a set of count queries.
Reuse denotes the algorithm that explores the sharing of computation among differ-
ent linkages. Reuse+Pruning denotes the algorithm that applies the pruning tech-
nique discussed in Section7.5.3 in addition to the reuse method. It is shown that the
two techniques improve the efficiency significantly.
Last, we evaluate the quality of the two approximation algorithms discussed in
Sections 7.6.2.1 and 7.6.2.2, respectively.
The quality of
ε
-approximate euqi-width histogram answers are computed as
| f (φ ) f (φ ) |
f (φ )
1
|{ φ | f (φ ) > τ }| f (φ ) > τ
1
, where f
(φ )
is the probability of interval
φ
and
f
(φ )
φ
estimated by the approximation method. The results
are shown in Figure 7.15(a), with respect to different
is the probability of x
values 10 x ( x
ε
=
,
,
,
,
3
4
5
6
7).
The precision and recall are also plotted.
Figure 7.15(b) illustrates the approximation quality of the answers to both
count and sum queries using the
ρ
quantiles. The quality is measured using
| f ( v i ) f ( v i ) |
f
1
k
i
1 1
, where v i is the value output as the approximation of the i -
=
(
v i )
 
Search WWH ::




Custom Search