6.13 Give a short example to show that items in a strong association rule actually may
be negatively correlated.
6.14 The following contingency table summarizes supermarket transaction data, where
hot dogs refers to the transactions containing hot dogs, ¬hot dogs refers to the
transactions that do not contain hot dogs, hamburgers refers to the transactions
containing hamburgers, and ¬hamburgers refers to the transactions that do not
contain hamburgers.

                hot dogs    ¬hot dogs    Σ_row
hamburgers        2000         500        2500
¬hamburgers       1000        1500        2500
Σ_col             3000        2000        5000

(a) Suppose that the association rule “hot dogs ⇒ hamburgers” is mined. Given a
minimum support threshold of 25% and a minimum confidence threshold of
50%, is this association rule strong?
(b) Based on the given data, is the purchase of hot dogs independent of the purchase
of hamburgers ? If not, what kind of correlation relationship exists between the
two?
(c) Compare the use of the all confidence, max confidence, Kulczynski, and cosine
measures with lift and correlation on the given data.
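The quantities named in parts (a) through (c) can all be computed directly from the four marginal and joint counts in the table. The following is a minimal sketch, not a reference solution: the function name `pattern_measures` and its argument names are hypothetical, and the χ² statistic is included only as one possible reading of "correlation" in part (c).

```python
from math import sqrt

def pattern_measures(n_a, n_b, n_ab, n):
    """Evaluate the rule A => B from counts (hypothetical helper).

    n_a, n_b : transactions containing A, containing B
    n_ab     : transactions containing both A and B
    n        : total number of transactions
    """
    p_a, p_b, p_ab = n_a / n, n_b / n, n_ab / n
    conf_ab = n_ab / n_a  # P(B | A), confidence of A => B
    conf_ba = n_ab / n_b  # P(A | B), confidence of B => A

    # Chi-square over the 2x2 table: sum of (observed - expected)^2 / expected.
    chi2 = 0.0
    for in_a in (True, False):
        for in_b in (True, False):
            row = n_a if in_a else n - n_a
            col = n_b if in_b else n - n_b
            if in_a and in_b:
                observed = n_ab
            elif in_a:
                observed = n_a - n_ab
            elif in_b:
                observed = n_b - n_ab
            else:
                observed = n - n_a - n_b + n_ab
            expected = row * col / n
            chi2 += (observed - expected) ** 2 / expected

    return {
        "support": p_ab,
        "confidence": conf_ab,
        "lift": p_ab / (p_a * p_b),          # 1 means independent
        "chi2": chi2,
        "all_confidence": min(conf_ab, conf_ba),
        "max_confidence": max(conf_ab, conf_ba),
        "kulczynski": 0.5 * (conf_ab + conf_ba),
        "cosine": n_ab / sqrt(n_a * n_b),
    }

# Counts read from the contingency table: A = hot dogs, B = hamburgers.
m = pattern_measures(n_a=3000, n_b=2500, n_ab=2000, n=5000)
for name, value in m.items():
    print(f"{name:>15}: {value:.4f}")
```

A lift above 1 indicates positive correlation and a lift below 1 indicates negative correlation; the last four measures depend only on the two conditional probabilities and so are unaffected by the number of transactions containing neither item.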
6.15 (Implementation project) The DBLP data set (www.informatik.uni-trier.de/~ley/db/)
consists of over one million entries of research papers published in computer
science conferences and journals. Among these entries, there
are a good number of authors that have coauthor relationships.
(a) Propose a method to efficiently mine a set of coauthor relationships that are
closely correlated (e.g., often coauthoring papers together).
(b) Based on the mining results and the pattern evaluation measures discussed in
this chapter, discuss which measure may convincingly uncover close collaboration
patterns better than others.
(c) Based on the study in (a), develop a method that can roughly predict advisor
and advisee relationships and the approximate period for such advisory
supervision.
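One possible starting point for part (a) is to treat each paper as a transaction and each author as an item, count single authors and author pairs in one pass, and score each frequent pair with a measure such as Kulczynski. The sketch below is illustrative only: it assumes the DBLP entries have already been parsed into a list of author-name lists, and the names `coauthor_scores`, `papers`, and `min_joint` are hypothetical.

```python
from collections import Counter
from itertools import combinations

def coauthor_scores(papers, min_joint=2):
    """Score author pairs by the Kulczynski measure (hypothetical helper).

    papers    : iterable of author-name lists, one list per paper
    min_joint : minimum number of joint papers for a pair to be kept
    """
    author_count = Counter()  # papers per author
    pair_count = Counter()    # papers per unordered author pair

    for authors in papers:
        unique = sorted(set(authors))  # ignore duplicate names, fix pair order
        author_count.update(unique)
        pair_count.update(combinations(unique, 2))

    scores = {}
    for (a, b), joint in pair_count.items():
        if joint < min_joint:
            continue  # support-style pruning of rare pairs
        # Average of P(b | a) and P(a | b).
        scores[(a, b)] = 0.5 * (joint / author_count[a] + joint / author_count[b])
    return scores

# Toy input standing in for parsed DBLP records (made-up names).
papers = [["A", "B"], ["A", "B"], ["A", "C"], ["B"], ["A", "B", "C"]]
scores = coauthor_scores(papers)
for pair, kulc in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(kulc, 3))
```

For part (c), the two conditional probabilities averaged here could also be inspected separately, since a strongly lopsided pair (one author nearly always publishing with the other, but not the reverse) may hint at an advisor and advisee relationship; that interpretation is a suggestion, not something the exercise states.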
6.6 Bibliographic Notes
Association rule mining was first proposed by Agrawal, Imielinski, and Swami [AIS93].
The Apriori algorithm discussed in Section 6.2.1 for frequent itemset mining was
presented in Agrawal and Srikant [AS94b]. A variation of the algorithm using a similar
pruning heuristic was developed independently by Mannila, Toivonen, and Verkamo