Java Reference
In-Depth Information
for some positive constant g . This expression is a polynomial in g . To minimize the number of
arithmetic operations, write the polynomial in the following algebraically equivalent form:
(…(( u 0 g + u 1 ) g + u 2 ) g + + u n - 2 ) g + u n - 1
This way of evaluating a polynomial is called Horner's method.
The following Java statements perform this computation for the string s and the int constant g :
int hash = 0;
int n = s.length();
for ( int i = 0; i < n; i++)
hash = g * hash + s.charAt(i);
The i th character of the string is s.charAt(i) . Adding this character to the product g * hash actu-
ally adds the character's Unicode value. An explicit cast of s.charAt(i) to int is not necessary
and would not affect the result.
This computation can cause an overflow, particularly for long strings. Java ignores these
overflows and, for an appropriate choice of g , the result will be a reasonable hash code. Current
implementations of the method hashCode in Java's class String use this computation with 31 as
the value of g . Realize, however, that the overflows can produce a negative result. You can deal
with that when you compress the hash code into an appropriate index for the hash table.
Question 1 Calculate the hash code for the string Java when g is 31. Compare your result
with the value of the expression " Java " .hashCode() .
21.9
The hash code for a primitive type. This segment contains Java operations that might be unfamil-
iar to you. However, they are not essential to the rest of this chapter.
If the search key's data type is int , you can use the key itself as the hash code. If the search
key is an instance of either byte , short , or char , you can cast it to an int to get a hash code.
Thus, casting to an int is one way to generate a hash code.
For other primitive types, you manipulate their internal binary representations. If the search
key is an integer of type long , it contains 64 bits. An int has 32 bits. Simply casting the 64-bit
search key to an int —or performing a modulo 2 32 —would lose its first 32 bits. As a result, all
keys that differ in only their first 32 bits will have the same hash code and collide. For this rea-
son, ignoring part of a search key can be a problem.
Note: Derive the hash code from the entire search key. Do not ignore part of it.
Instead of ignoring a part of a long search key, divide it into several pieces. Then combine
the pieces by using either addition or a bit-wise boolean operation such as exclusive or . This pro-
cess is called folding .
For example, let's divide a long search key into two 32-bit halves. To get the left half, we
can shift the search key to the right by a certain number of bits, or places. For example, if we
shift the 8-bit binary number 10101100 to the right by 4 bits, we will get 00001010. We have iso-
lated the number's left half and discarded its right half. If we now combine 00001010 with the
original value and ignore the left half of the result, we will effectively have combined the left and
right halves of the original key.
 
Search WWH ::




Custom Search