Database Reference
In-Depth Information
cess may ripple through the bucket sizes, but there are at most log 2 N different sizes, and
the combination of two adjacent buckets of the same size only requires constant time. As a
result, any new bit can be processed in O (log N ) time.
EXAMPLE 4.13 Suppose we start with the buckets of Fig. 4.2 and a 1 enters. First, the left-
most bucket evidently has not fallen out of the window, so we do not drop any buckets. We
create a new bucket of size 1 with the current timestamp, say t . There are now three buckets
of size 1, so we combine the leftmost two. They are replaced with a single bucket of size
2. Its timestamp is t − 2, the timestamp of the bucket on the right (i.e., the rightmost bucket
that actually appears in Fig. 4.2 .
There are now two buckets of size 2, but that is allowed by the DGIM rules. Thus, the
final sequence of buckets after the addition of the 1 is as shown in Fig. 4.3 .
Figure 4.3 Modified buckets after a new 1 arrives in the stream
4.6.6
Reducing the Error
Instead of allowing either one or two of each size bucket, suppose we allow either r − 1 or
r of each of the exponentially growing sizes 1 , 2 , 4 , . . . , for some integer r > 2. In order to
represent any possible number of 1s, we must relax this condition for the buckets of size 1
and buckets of the largest size present; there may be any number, from 1 to r , of buckets of
these sizes.
The rule for combining buckets is essentially the same as in Section 4.6.5 . If we get r +
1 buckets of size 2 j , combine the leftmost two into a bucket of size 2 j +1 . That may, in turn,
cause there to be r + 1 buckets of size 2 j +1 , and if so we continue combining buckets of
larger sizes.
The argument used in Section 4.6.4 can also be used here. However, because there are
more buckets of smaller sizes, we can get a stronger bound on the error. We saw there that
the largest relative error occurs when only one 1 from the leftmost bucket b is within the
query range, and we therefore overestimate the true count. Suppose bucket b is of size 2 j .
Then the true count is at least 1 + ( r − 1)(2 j− 1 + 2 j− 2 + · · · + 1) = 1 + ( r − 1)(2 j − 1). The
overestimate is 2 j− 1 − 1. Thus, the fractional error is
Bucket Sizes and Ripple-Carry Adders
Search WWH ::




Custom Search