items that is found to be frequent, there are only n possible association rules involving this
set of items, namely J − {j} → j for each j in J. If J is frequent, J − {j} must be at least as
frequent. Thus, it too is a frequent itemset, and we have already computed the support of
both J and J − {j}. The ratio of the support of J to the support of J − {j} is the confidence of the rule J − {j} → j.
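The rule-generation step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rules_from_itemset`, the dictionary `support`, and the toy counts are hypothetical and not from the text, and the sketch assumes J has at least two items so that every antecedent J − {j} is nonempty.

```python
from typing import Dict, FrozenSet, List, Tuple

def rules_from_itemset(
    itemset: FrozenSet[str],
    support: Dict[FrozenSet[str], int],
) -> List[Tuple[FrozenSet[str], str, float]]:
    """For a frequent itemset J, list each rule J - {j} -> j with its confidence.

    Assumes `support` already holds the counts for J and for every J - {j},
    which is the case once all frequent itemsets have been counted.
    """
    rules = []
    for j in sorted(itemset):
        antecedent = itemset - {j}
        # confidence = support(J) / support(J - {j})
        confidence = support[itemset] / support[antecedent]
        rules.append((antecedent, j, confidence))
    return rules

# Hypothetical support counts for a two-item example.
support = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 5,
    frozenset({"a", "b"}): 3,
}

rules = rules_from_itemset(frozenset({"a", "b"}), support)
for antecedent, consequent, confidence in rules:
    print(sorted(antecedent), "->", consequent, confidence)
```

A set of n items yields exactly n rules here, one per choice of j, and no basket is re-read: every number needed is already in the table of supports.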
It must be assumed that there are not too many frequent itemsets and thus not too many
candidates for high-support, high-confidence association rules. The reason is that each one
found must be acted upon. If we give the store manager a million association rules that
meet our thresholds for support and confidence, they cannot even read them, let alone act
on them. Likewise, if we produce a million candidates for biomarkers, we cannot afford
to run the experiments needed to check them out. Thus, it is normal to adjust the support
threshold so that we do not get too many frequent itemsets. This assumption leads, in later
sections, to important consequences about the efficiency of algorithms for finding frequent
itemsets.
6.1.5 Exercises for Section 6.1
EXERCISE 6.1.1 Suppose there are 100 items, numbered 1 to 100, and also 100 baskets, also
numbered 1 to 100. Item i is in basket b if and only if i divides b with no remainder. Thus,
item 1 is in all the baskets, item 2 is in all fifty of the even-numbered baskets, and so on.
Basket 12 consists of items {1, 2, 3, 4, 6, 12}, since these are all the integers that divide 12.
Answer the following questions:
(a) If the support threshold is 5, which items are frequent?
! (b) If the support threshold is 5, which pairs of items are frequent?
! (c) What is the sum of the sizes of all the baskets?
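For readers who want to experiment with this data, the baskets of Exercise 6.1.1 can be built directly from the divisibility rule. The following is a minimal sketch, not part of the exercise; the helper names `build_baskets` and `item_support` are hypothetical. It only constructs the data and checks the facts stated above, leaving the questions themselves to the reader.

```python
from typing import Dict, Set

def build_baskets(n: int = 100) -> Dict[int, Set[int]]:
    """Basket b holds every item i (1..n) that divides b with no remainder."""
    return {
        b: {i for i in range(1, n + 1) if b % i == 0}
        for b in range(1, n + 1)
    }

def item_support(baskets: Dict[int, Set[int]], item: int) -> int:
    """Number of baskets that contain the given item."""
    return sum(1 for items in baskets.values() if item in items)

baskets = build_baskets()

print(sorted(baskets[12]))        # the divisors of 12
print(item_support(baskets, 1))   # item 1 is in every basket
print(item_support(baskets, 2))   # item 2 is in the even-numbered baskets
```

An item is frequent when `item_support` returns a value at or above the support threshold; the same comparison applies to pairs once their supports are counted.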
! EXERCISE 6.1.2 For the item-basket data of Exercise 6.1.1, which basket is the largest?
EXERCISE 6.1.3 Suppose there are 100 items, numbered 1 to 100, and also 100 baskets, also
numbered 1 to 100. Item i is in basket b if and only if b divides i with no remainder. For
example, basket 12 consists of items
{12, 24, 36, 48, 60, 72, 84, 96}
Repeat Exercise 6.1.1 for this data.
! EXERCISE 6.1.4 This question involves data from which nothing interesting can be
learned about frequent itemsets, because there are no sets of items that are correlated. Suppose
the items are numbered 1 to 10, and each basket is constructed by including item i