Databases Reference
In-Depth Information
Algorithm: Apriori. Find frequent itemsets using an iterative level-wise approach based
on candidate generation.
Input:
D , a database of transactions;
min sup , the minimum support count threshold.
Output: L , frequent itemsets in D .
Method:
(1)
L 1 = find frequent 1-itemsets(D);
(2)
for ( k D 2; L k 1 6D
; k CC) f
C k = apriorigen ( L k 1 );
(3)
(4)
for each transaction t 2 D f // scan D for counts
(5)
C t = subset
.
C k , t
/
; // get the subsets of t that are candidates
(6)
for each candidate c 2 C t
(7)
c.countCC;
(8)
g
(9)
L k Df c 2 C k j c . count min sup g
(10) g
(11) return L D[ k L k ;
procedureapriorigen ( L k 1 :frequent . k 1/-itemsets)
(1) for each itemset l 1 2 L k 1
(2) for each itemset l 2 2 L k 1
(3) if ( l 1 [1] D l 2 [1]/^. l 1 [2] D l 2 [2]/
^...^. l 1 [ k 2] D l 2 [ k 2]/^. l 1 [ k 1] < l 2 [ k 1]/ then f
(4) c D l 1 1 l 2 ; // join step: generate candidates
(5) if hasinfrequentsubset ( c , L k 1 ) then
(6) delete c ; // prune step: remove unfruitful candidate
(7) else add c to C k ;
(8) g
(9) return C k ;
procedurehasinfrequentsubset ( c : candidate k -itemset;
L k 1 : frequent
. k 1
/
-itemsets); // use prior knowledge
(1)
-subset s of c
(2) if s 62 L k 1 then
(3) return TRUE;
(4)
for each .
k 1
/
return FALSE;
Figure 6.4 Apriori algorithm for discovering frequent itemsets for mining Boolean association rules.
A procedure can then be called to generate association rules from the frequent itemsets.
Such a procedure is described in Section 6.2.2.
The apriorigen procedure performs two kinds of actions, namely, join and prune , as
described before. In the join component, L k 1 is joined with L k 1 to generate potential
candidates (steps 1-4). The prune component (steps 5-7) employs the Apriori property
to remove candidates that have a subset that is not frequent. The test for infrequent
subsets is shown in procedure hasinfrequentsubset .
 
Search WWH ::




Custom Search