Searching - Data Structures and Algorithm Analysis

9.2 Implement the three self-organizing list heuristics count, move-to-front, and transpose. Compare the cost for running the three heuristics on various input data. The cost metric should be the total number of comparisons required when searching the list. It is important to compare the heuristics using input data for which self-organizing lists are reasonable, that is, on frequency distributions that are uneven. One good approach is to read text files. The list should store individual words in the text file. Begin with an empty list, as was done for the text compression example of Section 9.2. Each time a word is encountered in the text file, search for it in the self-organizing list. If the word is found, reorder the list as appropriate. If the word is not in the list, add it to the end of the list and then reorder as appropriate.
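
As a starting point, the search-then-reorder procedure described above might be sketched as follows. This is only one possible arrangement, not a reference solution: the class name SelfOrganizingList, its methods, and the choice of an array-based list are all assumptions made for illustration. The count heuristic is taken here to mean moving a record ahead of every record with a strictly smaller access count.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; the class and method names are hypothetical. */
class SelfOrganizingList {
    enum Heuristic { COUNT, MOVE_TO_FRONT, TRANSPOSE }

    private final Heuristic heuristic;
    private final List<String> words = new ArrayList<>();
    private final List<Integer> counts = new ArrayList<>(); // access counts, parallel to words
    private long comparisons = 0;                           // the cost metric

    SelfOrganizingList(Heuristic heuristic) { this.heuristic = heuristic; }

    /** Searches for word, appends it if absent, then reorders the list. */
    void access(String word) {
        int pos = -1;
        for (int i = 0; i < words.size(); i++) {
            comparisons++;                       // one comparison per record examined
            if (words.get(i).equals(word)) { pos = i; break; }
        }
        if (pos < 0) {                           // not found: add to the end of the list
            words.add(word);
            counts.add(0);
            pos = words.size() - 1;
        }
        counts.set(pos, counts.get(pos) + 1);
        reorder(pos);
    }

    private void reorder(int pos) {
        switch (heuristic) {
            case MOVE_TO_FRONT:
                move(pos, 0);
                break;
            case TRANSPOSE:                      // swap with the preceding record
                if (pos > 0) move(pos, pos - 1);
                break;
            case COUNT:                          // move ahead of records with smaller counts
                int target = pos;
                while (target > 0 && counts.get(target - 1) < counts.get(pos)) target--;
                move(pos, target);
                break;
        }
    }

    /** Moves the record at index from to index to (to <= from). */
    private void move(int from, int to) {
        words.add(to, words.remove(from));
        counts.add(to, counts.remove(from));
    }

    long comparisons() { return comparisons; }

    List<String> contents() { return new ArrayList<>(words); }

    public static void main(String[] args) {
        String text = "the cat saw the dog and the dog saw the cat";
        for (Heuristic h : Heuristic.values()) {
            SelfOrganizingList list = new SelfOrganizingList(h);
            for (String w : text.split(" ")) list.access(w);
            System.out.println(h + ": " + list.comparisons() + " comparisons");
        }
    }
}
```

For the actual comparison, the words would come from a text file rather than a fixed string, and a linked list may be a better fit than ArrayList when records are moved often.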

9.3 Implement the text compression system described in Section 9.2.

9.4 Implement a system for managing document retrieval. Your system should have the ability to insert (abstract references to) documents into the system, associate keywords with a given document, and search for documents with specified keywords.
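
One way to organize the three operations named above is an inverted index that maps each keyword to the set of documents carrying it. The sketch below is a minimal illustration under several assumptions: the class DocumentIndex and its method names are hypothetical, a document is represented by a string reference and an integer id, and a search is taken to mean "documents associated with all of the given keywords" (the exercise leaves this open).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative sketch only; names and the all-keywords search rule are assumptions. */
class DocumentIndex {
    private final List<String> documents = new ArrayList<>();            // abstract references
    private final Map<String, Set<Integer>> postings = new HashMap<>();  // keyword -> document ids

    /** Inserts a document reference and returns its id. */
    int insert(String reference) {
        documents.add(reference);
        return documents.size() - 1;
    }

    /** Associates a keyword with an existing document. */
    void associate(int docId, String keyword) {
        if (docId < 0 || docId >= documents.size())
            throw new IllegalArgumentException("unknown document id " + docId);
        postings.computeIfAbsent(keyword, k -> new TreeSet<>()).add(docId);
    }

    /** Returns the references of documents associated with every given keyword. */
    List<String> search(String... keywords) {
        Set<Integer> result = null;
        for (String kw : keywords) {
            Set<Integer> ids = postings.getOrDefault(kw, Collections.<Integer>emptySet());
            if (result == null) result = new TreeSet<>(ids);  // copy so postings stay intact
            else result.retainAll(ids);                       // intersect with later keywords
        }
        List<String> refs = new ArrayList<>();
        if (result != null)
            for (int id : result) refs.add(documents.get(id));
        return refs;
    }

    public static void main(String[] args) {
        DocumentIndex index = new DocumentIndex();
        int a = index.insert("doc-A");
        int b = index.insert("doc-B");
        index.associate(a, "hashing");
        index.associate(a, "search");
        index.associate(b, "hashing");
        System.out.println(index.search("hashing", "search"));
    }
}
```

A HashMap is used for the keyword table here only for brevity; any dictionary implementation from this chapter could take its place.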

9.5 Implement a database stored on disk using bucket hashing. Define records to be 128 bytes long with a 4-byte key and 120 bytes of data. The remaining 4 bytes are available for you to store necessary information to support the hash table. A bucket in the hash table will be 1024 bytes long, so each bucket has space for 8 records. The hash table should consist of 27 buckets (total space for 216 records with slots indexed by positions 0 to 215) followed by the overflow bucket at record position 216 in the file. The hash function for key value K should be K mod 213. (Note that this means the last three slots in the table will not be home positions for any record.) The collision resolution function should be linear probing with wrap-around within the bucket. For example, if a record is hashed to slot 5, the collision resolution process will attempt to insert the record into the table in the order 5, 6, 7, 0, 1, 2, 3, and finally 4. If a bucket is full, the record should be placed in the overflow section at the end of the file.

Your hash table should implement the dictionary ADT of Section 4.4. When you do your testing, assume that the system is meant to store about 100 or so records at a time.
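
The slot arithmetic in this project is easy to get wrong, so it may help to isolate it before touching the file. The following sketch computes the home slot, the probe order within a bucket, and the byte offset of a slot. The class and method names are hypothetical, records are assumed to be stored contiguously from byte 0 of the file, and the use of Math.floorMod to keep negative keys in range is an assumption not stated in the exercise.

```java
/** Illustrative sketch of the slot arithmetic only; no file I/O is performed. */
class BucketLayout {
    static final int RECORD_BYTES = 128;
    static final int SLOTS_PER_BUCKET = 8;                          // 1024 / 128
    static final int NUM_BUCKETS = 27;
    static final int TABLE_SLOTS = NUM_BUCKETS * SLOTS_PER_BUCKET;  // 216
    static final int HASH_MODULUS = 213;
    static final int OVERFLOW_START = TABLE_SLOTS;                  // record position 216

    /** Home slot for a key: K mod 213 (floorMod is an assumption for negative keys). */
    static int homeSlot(int key) {
        return Math.floorMod(key, HASH_MODULUS);
    }

    /** Slots to try, in order, using linear probing that wraps within the bucket. */
    static int[] probeOrder(int homeSlot) {
        int bucketStart = (homeSlot / SLOTS_PER_BUCKET) * SLOTS_PER_BUCKET;
        int offset = homeSlot % SLOTS_PER_BUCKET;
        int[] order = new int[SLOTS_PER_BUCKET];
        for (int i = 0; i < SLOTS_PER_BUCKET; i++)
            order[i] = bucketStart + (offset + i) % SLOTS_PER_BUCKET;
        return order;
    }

    /** Byte offset of a slot, assuming records are laid out contiguously from byte 0. */
    static long fileOffset(int slot) {
        return (long) slot * RECORD_BYTES;
    }

    public static void main(String[] args) {
        // Reproduces the example in the text: slot 5 probes 5, 6, 7, 0, 1, 2, 3, 4.
        System.out.println(java.util.Arrays.toString(probeOrder(5)));
        System.out.println(fileOffset(OVERFLOW_START));             // start of overflow area
    }
}
```

If every slot returned by probeOrder is occupied, the record would go to the overflow area beginning at slot OVERFLOW_START; how that area is searched is left to the implementer.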

9.6 Implement the dictionary ADT of Section 4.4 by means of a hash table with linear probing as the collision resolution policy. You might wish to begin with the code of Figure 9.9. Using empirical simulation, determine the cost of insert and delete as the load factor α grows (i.e., reconstruct the dashed lines of Figure 9.10). Then, repeat the experiment using quadratic probing and pseudo-random probing. What can you say about the relative performance of these three collision resolution policies?
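
The shape of such an experiment might look like the sketch below, which measures only the insert cost (the number of slots examined before an empty one is found) at several load factors. It does not use the code of Figure 9.9 and is not a dictionary implementation: it tracks slot occupancy only, ignores duplicate keys, and does not measure delete. The class name, the table size of 1009, the seeds, and the number of trials are all arbitrary assumptions.

```java
import java.util.Random;
import java.util.function.IntUnaryOperator;

/** Illustrative simulation sketch; all names and constants are assumptions. */
class ProbeExperiment {
    static final int M = 1009;  // table size; a prime chosen arbitrarily for this sketch

    /** Slot examined on the i-th probe (i = 0 is the home slot). */
    static int slotFor(int key, int i, IntUnaryOperator probe) {
        return (Math.floorMod(key, M) + probe.applyAsInt(i)) % M;
    }

    /** Slots examined until an empty one is found, or -1 if none within M probes. */
    static int insertCost(boolean[] used, int key, IntUnaryOperator probe) {
        for (int i = 0; i < M; i++)
            if (!used[slotFor(key, i, probe)]) return i + 1;
        return -1;
    }

    /** Fills the table to load factor alpha, then averages the cost of trial inserts. */
    static double averageInsertCost(IntUnaryOperator probe, double alpha, long seed) {
        Random rng = new Random(seed);
        boolean[] used = new boolean[M];
        int target = (int) (alpha * M);
        for (int filled = 0; filled < target; ) {
            int key = rng.nextInt(Integer.MAX_VALUE);
            int cost = insertCost(used, key, probe);
            if (cost < 0) continue;              // probe sequence found no free slot; skip key
            used[slotFor(key, cost - 1, probe)] = true;
            filled++;
        }
        long total = 0;
        int trials = 0;
        for (int t = 0; t < 10000; t++) {        // trial keys are measured but not inserted
            int cost = insertCost(used, rng.nextInt(Integer.MAX_VALUE), probe);
            if (cost > 0) { total += cost; trials++; }
        }
        return (double) total / trials;
    }

    /** Probe offsets drawn from a fixed random permutation of 1..M-1. */
    static IntUnaryOperator pseudoRandomProbe(long seed) {
        int[] perm = new int[M];                 // perm[0] stays 0 so the home slot is first
        for (int i = 0; i < M; i++) perm[i] = i;
        Random rng = new Random(seed);
        for (int i = M - 1; i > 1; i--) {        // Fisher-Yates shuffle of positions 1..M-1
            int j = 1 + rng.nextInt(i);
            int tmp = perm[i]; perm[i] = perm[j]; perm[j] = tmp;
        }
        return i -> perm[i];
    }

    public static void main(String[] args) {
        IntUnaryOperator linear = i -> i;
        IntUnaryOperator quadratic = i -> i * i;
        IntUnaryOperator pseudo = pseudoRandomProbe(42L);
        System.out.println("alpha    linear  quadratic  pseudo-random");
        for (int step = 1; step <= 9; step++) {
            double alpha = step / 10.0;
            System.out.printf("%.1f   %8.2f   %8.2f   %8.2f%n", alpha,
                averageInsertCost(linear, alpha, 1L),
                averageInsertCost(quadratic, alpha, 1L),
                averageInsertCost(pseudo, alpha, 1L));
        }
    }
}
```

Note that quadratic probing may fail to reach every slot, which is why insertCost can return -1; a full solution would need to decide how to treat such keys and would average over several independently filled tables.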
