Java Reference
In-Depth Information
records by expected frequency of access. While the benefits might not be as great
as when organized by key value, the cost to organize (at least approximately) by
frequency of access can be much cheaper, and thus can speed up sequential search
in some situations.
Assume that we know, for each key k i , the probability p i that the record with
key k i will be requested. Assume also that the list is ordered so that the most
frequently requested record is first, then the next most frequently requested record,
and so on. Search in the list will be done sequentially, beginning with the first
position. Over the course of many searches, the expected number of comparisons
required for one search is
C n = 1p 0 + 2p 1 + ::: + np n1 :
In other words, the cost to access the record in L[0] is 1 (because one key value is
looked at), and the probability of this occurring is p 0 . The cost to access the record
in L[1] is 2 (because we must look at the first and the second records' key values),
with probability p 1 , and so on. For n records, assuming that all searches are for
records that actually exist, the probabilities p 0 through p n1 must sum to one.
Certain probability distributions give easily computed results.
Example9.1 Calculate the expected cost to search a list when each record
has equal chance of being accessed (the classic sequential search through
an unsorted list). Setting p i = 1=n yields
n X
C n =
i=n = (n + 1)=2:
i=1
This result matches our expectation that half the records will be accessed on
average by normal sequential search. If the records truly have equal access
probabilities, then ordering records by frequency yields no benefit. We saw
in Section 9.1 the more general case where we must consider the probability
(labeled p n ) that the search key does not match that for any record in the
array. In that case, in accordance with our general formula, we get
(1p n ) n + 1
2
+p n n = n + 1 np n np n + 2p n
2
= n + 1 + p 0 (n 1)
2
:
Thus, n+1 2 C n n, depending on the value of p 0 .
A geometric probability distribution can yield quite different results.
 
Search WWH ::




Custom Search