Databases Reference
In-Depth Information
! Exercise 6.2.3 : Let there be I items in a market-basket data set of B baskets.
Suppose that every basket contains exactly K items.
As a function of I, B,
and K:
(a) How much space does the triangular-matrix method take to store the
counts of all pairs of items, assuming four bytes per array element?
(b) What is the largest possible number of pairs with a nonzero count?
(c) Under what circumstances can we be certain that the triples method will
use less space than the triangular array?
!! Exercise 6.2.4 : How would you count all itemsets of size 3 by a generalization
of the triangular-matrix method? That is, arrange that in a one-dimensional
array there is exactly one element for each set of three items.
! Exercise 6.2.5 : Suppose the support threshold is 5. Find the maximal fre-
quent itemsets for the data of:
(a) Exercise 6.1.1.
(b) Exercise 6.1.3.
Exercise 6.2.6 : Apply the A-Priori Algorithm with support threshold 5 to
the data of:
(a) Exercise 6.1.1.
(b) Exercise 6.1.3.
! Exercise 6.2.7 : Suppose we have market baskets that satisfy the following
assumptions:
1. The support threshold is 10,000.
2. There are one million items, represented by the integers 0, 1, . . . , 999999.
3. There are N frequent items, that is, items that occur 10,000 times or
more.
4. There are one million pairs that occur 10,000 times or more.
5. There are 2M pairs that occur exactly once. Of these pairs, M consist of
two frequent items; the other M each have at least one nonfrequent item.
6. No other pairs occur at all.
7. Integers are always represented by 4 bytes.
Search WWH ::




Custom Search