Database Reference
In-Depth Information
tions of the rows, but a true permutation is only possible because the number of rows, 5, is
a prime. In general, there will be collisions, where two rows get the same hash value.
Figure 3.4 Hash functions computed for the matrix of Fig. 3.2
Now, let us simulate the algorithm for computing the signature matrix. Initially, this mat-
rix consists of all ∞'s:
First, we consider row 0 of Fig. 3.4 . We see that the values of h 1 (0) and h 2 (0) are both 1.
The row numbered 0 has 1's in the columns for sets S 1 and S 4 , so only these columns of the
signature matrix can change. As 1 is less than ∞, we do in fact change both values in the
columns for S 1 and S 4 . The current estimate of the signature matrix is thus:
Now, we move to the row numbered 1 in Fig. 3.4 . This row has 1 only in S 3 , and its hash
values are h 1 (1) = 2 and h 2 (1) = 4. Thus, we set SIG(1 , 3) to 2 and SIG(2 , 3) to 4. All other
signature entries remain as they are because their columns have 0 in the row numbered 1.
The new signature matrix:
The row of Fig. 3.4 numbered 2 has 1's in the columns for S 2 and S 4 , and its hash values
are h 1 (2) = 3 and h 2 (2) = 2. We could change the values in the signature for S 4 , but the
values in this column of the signature matrix, [1 , 1], are each less than the corresponding
hash values [3 , 2]. However, since the column for S 2 still has ∞'s, we replace it by [3 , 2],
resulting in:
Next comes the row numbered 3 in Fig. 3.4 . Here, all columns but S 2 have 1, and the
hash values are h 1 (3) = 4 and h 2 (3) = 0. The value 4 for h 1 exceeds what is already in the
signature matrix for all the columns, so we shall not change any values in the first row of
the signature matrix. However, the value 0 for h 2 is less than what is already present, so we
lower SIG(2 , 1), SIG(2 , 3) and SIG(2 , 4) to 0. Note that we cannot lower SIG(2 , 2) because
Search WWH ::




Custom Search