Database Reference
In-Depth Information
16: if h i = 0 then
17: Find a bit in i ith bloom filter which is set to 1, and reset to 0.
18: Set the bit at h i position to 1
19: end if
20: end for
21: else
22: With probability ( s / iter ) insert e by setting all the bit positions in H .
23: If e was decided to be inserted then randomly reset one bit positions
from each of the k Bloom filters.
24: end if
25: end if
26: iter iter + 1
27: end for
This transition (Event 1) may happen during any of the iterations from ( s + 1) to
m. Hence, l ∈ [ s + 1, m ]. Since the different Bloom filters are independent, the final
decision ( distinct or duplicate ) regarding e m +1 is taken after probing all the bit posi-
tions of H , and hence,
k
s
m
≈−
ks
m
P
=−
1
1
(13.1)
H
It can be observed that the transition of the bit may also be possible during the first
s elements of the stream (Event 2). Since the first s elements of the stream are always
inserted, all the bit positions in H should be set at least once during this period for an
element to be reported as duplicate. Therefore, using similar computations as above,
the probability that all the bits for e m +1 in H is set in the initial s iterations is given by
k
1
s
m
(13.2)
P
=−
1
H
e
s
Either of the above two events will contribute to the FPR, hence the probability of
e m +1 being reported as an FP can be obtained using Equations 13.1 and 13.2, which
is given by
k
m
U
U
1
−+−
ks
m
1
s
m
P
=
1
1
(13.3)
FPR
e
The detailed proof can be found in [15]. Analyzing Equation 13.3, it can be seen
that as the stream length m tends to infinity, the right multiplicative factor tends
to 1. However, as U − 1 < U the left multiplicative term tends to 0. Hence as the
stream length increases, the observed FPR decreases and nearly becomes constant.
This leads to a stable performance of RSBF similar to that of SBF. However, RSBF
achieved this convergence much faster as opposed to SBF as discussed in [15].
Search WWH ::




Custom Search