Database Reference
In-Depth Information
6.4.4 Scalability
To test the scalability of the algorithms, we first fix the sliding window width to
200, and change the number of uncertain streams from 100 to 500. The maximum
number of instances in a window is 100
000. As the number of streams increases,
the Poisson binomial recurrence takes more time in the deterministic method. Thus,
the runtime increases. However, all methods are linearly scalable. The results are
shown in Figure 6.7(a).
Then, we fix the number of streams to 100, and vary the sliding window width
from 200 to 1
,
000. The runtime in the deterministic method increases substantially
faster than the other methods. The sampling method is more stable, because its run-
time is related to only the sample size and the number of streams. For the methods
using quantile summaries, after compressing the instances in to a quantile summary,
the increase of sliding window width does not affect the runtime noticeably. The re-
sults are shown in Figure 6.7(b).
In terms of memory usage, Figures 6.7(c) and 6.7(d) show the scalability of each
algorithm with respect to the number of uncertain streams and the sliding window
width, respectively. The memory used by the deterministic exact algorithm and the
sampling algorithm increases linearly, since it is proportional to the number of in-
stances in a sliding window. The memory used by the extended deterministic method
using quantiles and the sampling method using quantiles does not change dramat-
ically, because the number of instances for each object in the sliding window only
depends on parameter
,
φ
.
6.5 Summary
In this chapter, we proposed a novel uncertain data stream model and continuous
probabilistic threshold top- k queries on uncertain streams, which are different from
the existing probabilistic stream models and queries. A deterministic method and a
sampling method, as well as their space efficient versions using quantiles were de-
veloped to answer those queries. Experimental results on real data sets and synthetic
data sets were reported, which show the effectiveness and efficiency of the methods.
Search WWH ::




Custom Search