Database Reference
In-Depth Information
13.5.1.2 False-Negative Rate
A false-negative error occurs in the stream when a duplicate element is recognized
as distinct. This section focuses on determining the probability of occurrence of an
FN. As per the working of RSBF (Algorithm 13.2), an element
e
will be an FN if it
has occurred in the stream earlier and one of the following two cases hold:
1. At least one of the
k
bits of the hash positions of
e
(set during the previous
occurrence of
e
) has been reset during the insertion of another stream ele-
ment into the reservoir.
2. When
e
occurred earlier in the stream it was not inserted due to low inser-
tion probability of the stream then (by reservoir sampling). However,
according to the threshold
p
* in Algorithm 13.2, every distinct element in
the stream is inserted if the current insertion probability,
p
i
≤
p
*. Therefore,
if previous appearances of
e
had occurred before
p
i
was less than
p
* and
were not inserted, then it is likely to be detected as an FN when
e
repeats for
the first time after the insertion probability of the reservoir falls below
p
*.
Consider the probability of occurrence of an FN for an element
e
m
+1
, at the (
m
+ 1)t h
iteration. Let the previous occurrence of element
e
m
+1
be at position
x
, where it was
inserted into the reservoir. Therefore, Pr(
e
m
+1
occurs at
x and
is inserted) =
P
x
= p
x
/
U.
Now, for all iterations from (
x
+ 1)
to
m
, either
e
m
+1
has not occurred in the stream or
was not inserted. Thus, Pr(
e
m
+1
has not occurred OR
e
m
+1
has not been inserted after
x
)
is given by
mx
−
m
−
s
(
mmx
Um
−
)
U
−
11
−
p
≤−
s
Um
s
Um
∏
P
=
+
i
1
≤
e
∴=
i
psi
/and
is small
x
′
U
U
ix
=+
1
Now, P r(
e
m
+1
was last inserted at position
x
)
is given by
−−
sm x
Um
(
)
s
Ux
e
PPP
=
≤
(13.4)
l
x
x
′
Since
e
m+
1
was last inserted at position
x
, the
k
bits corresponding to
e
m+
1
were all
set to 1. Therefore,
e
m
+1
will be a FN if at least one of those
k
bits is reset to 0. Due
to the deletion operation in case of insertion of an element into the reservoir, some of
those
k
bits can be reset again. Let
y
be the last iteration where there is a transition
from 0 to 1 for any of the
k
n = bits corresponding to
e
m
+1
, after which it is not reset
again until the
m
th
iteration, and hence
x
≤
y
≤
m.
Therefore, Pr(a bit is set at
y
) =
P
y
=
p
y
/
s
= 1/
y.
Also, Pr(that bit is not reset after
y
)
is given as
m
+−
1
y
m
∏
1
P
== −
p
(
1
p
)
=
(13.5)
y
′
i
i
s
iy
=+
1
Search WWH ::
Custom Search