Database Reference
In-Depth Information
certain value or, in other words, if
P
(
A
∩
B
)
P
(
A
)
is greater than a constant
d
.The
former is just a test of statistical independence.
The most well-known algorithms for mining association rules are the
Apriori algorithm and the DHP algorithm. The main idea is to generate,
in the
i
th iteration, the set of candidate itemsets of length
i
(denoted
C
i
)
and prune the ones that do not satisfy the minimum required support. In
this way, we generate the sets called large itemsets
L
i
, which will be used
for finding the sets of candidate itemsets with length
i
+ 1. To prune these
sets, the Apriori principle is typically applied: if an itemset is frequent, all of
its subsets must also be frequent. For instance,
cannot be a frequent
itemset if either
A
or
B
are not frequent. The DHP algorithm is similar to
Apriori, but it uses hashing to test the eligibility of a
k
-itemset.
Let us show how the Apriori algorithm works, using the example intro-
duced above. We first rewrite the transaction table as a table with two
columns
Item
and
Count
,wherecolumn
Count
contains the number of times
that an item appears in a transaction in the database.
{A, B}
Item
Count
1
3
2
2
3
2
4
1
5
1
6
1
Assume that the minimum support required is
minsup
= 50%, which
means each itemset must appear at least two times in the database of
transactions, in order to be considered frequent. Initially, every item is a
candidate 1-itemset
C
1
. However, only items 1, 2, and 3 have support at
least equal to
minsup
. Thus, we delete the remaining ones (depicted in light
gray in the table above) and we obtain the set of large 1-itemsets
L
1
. With
this set we generate the new candidate itemset table
C
2
, depicted below.
Item
Count
{
1,2
}
1
{
1,3
}
2
{
2,3
}
1
Then, the only 2-itemset that satisfies
minsup
is
. Since we cannot
generate 3-itemsets (because the set would contain the subset
{
1
,
3
}
,which
does not satisfy the minimum support), we stop here. Then, the only two
rules that can be generated are
1
{
1
,
2
}
⇒
3
and
3
⇒
1
.
Search WWH ::
Custom Search