Database Reference
In-Depth Information
The time complexity of query evaluation in a sliding window using the above
sampling method is O
3ln 2
δ
mn 1
(
φ )
, where m is the number of samples. If m
,(0
<
2
ξ
δ <
1,
ξ >
0), then, the upper bound of approximation error is
φ + ε + ξ
with a
δ
probability at least 1
.
In the above two extended algorithms using approximate quantile summaries, the
space complexity of the algorithms is reduced from O
n log 2
εω
(
n
ω)
to O
(
)
, which is
2
ε
the space complexity of computing
ε
approximate quantiles.
6.4 Experimental Results
In this section, we report a systematic empirical study. All the experiments were
conducted on a PC computer with a 3.0 GHz Pentium 4 CPU, 1
0 GB main memory,
and a 160 GB hard disk, running the Microsoft Windows XP Professional Edition
operating system. Our algorithms were implemented in Microsoft Visual Studio
2005.
.
6.4.1 Results on Real Data Sets
To illustrate the effectiveness of probabilistic threshold top- k queries over sliding
windows in real applications, we use the seismic data collected from the wireless
sensor network monitoring volcanic eruptions 1 . 16 sensors were deployed at Reven-
tador, an active volcano in Ecuador. Each of the 16 sensors continuously sampled
seismic data, and the last 60 seconds of data from each node was examined at the
base station. To detect the eruption, it is interesting to continuously report the top- k
monitoring locations with the highest seismic values in the last 60 seconds.
The seismic data reported by each sensor is treated as an uncertain stream, and
each data record is an instance. We test probabilistic threshold top- k queries with
different parameter values on the data set. Since the results demonstrate the similar
patterns, we only report the answers to a probabilistic threshold top- k query with
k
4 in this topic. We consider a sliding window width of 60 instances
per stream. The answers to the query in 10 consecutive sliding windows are reported
in Table 6.1. As comparison, we also compute the average value of each stream in
each sliding window and report the top-5 streams with the highest average seismic
values.
The answers to the probabilistic threshold top- k query listed in Table 6.1 reveal
the following interesting patterns. First, the seismic values reported by sensors O 2
and O 16 are consistently among the top-5 with high confidence in the 10 sliding
windows. The rankings of seismic values in those locations are stable. Second, the
seismic values reported by sensor O 6 is among the top-5 with high confidence in the
=
5 and p
=
0
.
1 http://fiji.eecs.harvard.edu/Volcano
 
Search WWH ::




Custom Search