Databases Reference
In-Depth Information
Let us consider how many functions we need to get a reasonable probability
of catching a match, without having to compare the fingerprint on the gun with
each of the millions of fingerprints in the database. First, the probability that
two fingerprints from different fingers would be in the bucket for a function f
in F is (0.2) 6 = 0.000064. The reason is that they will both go into the bucket
only if they each have a minutia in each of the three grid points associated with
f , and the probability of each of those six independent events is 0.2.
Now, consider the probability that two fingerprints from the same finger
wind up in the bucket for f . The probability that the first fingerprint has
minutiae in each of the three squares belonging to f is (0.2) 3 = 0.008. However,
if it does, then the probability is (0.8) 3 = 0.512 that the other fingerprint
will as well. Thus, if the fingerprints are from the same finger, there is a
0.008×0.512 = 0.004096 probability that they will both be in the bucket of f .
That is not much; it is about one in 200. However, if we use many functions
from F, but not too many, then we can get a good probability of matching
fingerprints from the same finger while not having too many false positives -
fingerprints that must be considered but do not match.
Example 3.22 : For a specific example, let us suppose that we use 1024
functions chosen randomly from F. Next, we shall construct a new fam-
ily F 1 by performing a 1024-way OR on F. Then the probability that F 1
will put fingerprints from the same finger together in at least one bucket is
1−(1−0.004096) 1024 = 0.985. On the other hand, the probability that
two fingerprints from different fingers will be placed in the same bucket is
(1−(1−0.000064) 1024
= 0.063.
That is, we get about 1.5% false negatives
and about 6.3% false positives.
2
The result of Example 3.22 is not the best we can do. While it offers only a
1.5% chance that we shall fail to identify the fingerprint on the gun, it does force
us to look at 6.3% of the entire database. Increasing the number of functions
from F will increase the number of false positives, with only a small benefit
of reducing the number of false negatives below 1.5%. On the other hand, we
can also use the AND construction, and in so doing, we can greatly reduce
the probability of a false positive, while making only a small increase in the
false-negative rate. For instance, we could take 2048 functions from F in two
groups of 1024. Construct the buckets for each of the functions. However, given
a fingerprint P on the gun:
1. Find the buckets from the first group in which P belongs, and take the
union of these buckets.
2. Do the same for the second group.
3. Take the intersection of the two unions.
4. Compare P only with those fingerprints in the intersection.
Search WWH ::




Custom Search