Database Reference
In-Depth Information
certain value or, in other words, if P ( A B )
P ( A ) is greater than a constant d .The
former is just a test of statistical independence.
The most well-known algorithms for mining association rules are the
Apriori algorithm and the DHP algorithm. The main idea is to generate,
in the i th iteration, the set of candidate itemsets of length i (denoted C i )
and prune the ones that do not satisfy the minimum required support. In
this way, we generate the sets called large itemsets L i , which will be used
for finding the sets of candidate itemsets with length i + 1. To prune these
sets, the Apriori principle is typically applied: if an itemset is frequent, all of
its subsets must also be frequent. For instance,
cannot be a frequent
itemset if either A or B are not frequent. The DHP algorithm is similar to
Apriori, but it uses hashing to test the eligibility of a k -itemset.
Let us show how the Apriori algorithm works, using the example intro-
duced above. We first rewrite the transaction table as a table with two
columns Item and Count ,wherecolumn Count contains the number of times
that an item appears in a transaction in the database.
{A, B}
Item
Count
1
3
2
2
3
2
4
1
5
1
6
1
Assume that the minimum support required is minsup = 50%, which
means each itemset must appear at least two times in the database of
transactions, in order to be considered frequent. Initially, every item is a
candidate 1-itemset C 1 . However, only items 1, 2, and 3 have support at
least equal to minsup . Thus, we delete the remaining ones (depicted in light
gray in the table above) and we obtain the set of large 1-itemsets L 1 . With
this set we generate the new candidate itemset table C 2 , depicted below.
Item
Count
{ 1,2 }
1
{ 1,3 }
2
{ 2,3 }
1
Then, the only 2-itemset that satisfies minsup is
. Since we cannot
generate 3-itemsets (because the set would contain the subset
{
1 , 3
}
,which
does not satisfy the minimum support), we stop here. Then, the only two
rules that can be generated are 1
{
1 , 2
}
3 and 3
1 .
 
Search WWH ::




Custom Search