Database Reference
In-Depth Information
6.2.7
Exercises for Section 6.2
EXERCISE 6.2.1 If we use a triangular matrix to count pairs, and n , the number of items, is
20, what pair's count is in a [100]?
! EXERCISE 6.2.2 In our description of the triangular-matrix method in Section 6.2.2 , the
formula for k involves dividing an arbitrary integer i by 2. Yet we need to have k always be
an integer. Prove that k will, in fact, be an integer.
! EXERCISE 6.2.3 Let there be I items in a market-basket data set of B baskets. Suppose that
every basket contains exactly K items. As a function of I , B , and K :
(a) How much space does the triangular-matrix method take to store the counts of all pairs
of items, assuming four bytes per array element?
(b) What is the largest possible number of pairs with a nonzero count?
(c) Under what circumstances can we be certain that the triples method will use less space
than the triangular array?
!! EXERCISE 6.2.4 How would you count all itemsets of size 3 by a generalization of the
triangular-matrix method? That is, arrange that in a one-dimensional array there is exactly
one element for each set of three items.
! EXERCISE 6.2.5 Suppose the support threshold is 5. Find the maximal frequent itemsets
for the data of:
(a) Exercise 6.1.1 .
(b) Exercise 6.1.3 .
EXERCISE 6.2.6 Apply the A-Priori Algorithm with support threshold 5 to the data of:
(a) Exercise 6.1.1 .
(b) Exercise 6.1.3 .
! EXERCISE 6.2.7 Suppose we have market baskets that satisfy the following assumptions:
(1) The support threshold is 10,000.
(2) There are one million items, represented by the integers 0 , 1 , . . . , 999999.
(3) There are N frequent items, that is, items that occur 10,000 times or more.
(4) There are one million pairs that occur 10,000 times or more.
Search WWH ::




Custom Search