One strategy for generating frequent itemsets is to reduce the number of candidates
by pruning (using the Apriori algorithm).
The Apriori algorithm
Apriori is one of the oldest and most commonly used algorithms for mining association
rules. The Apriori algorithm is built on the notion of a frequent itemset.
For example, suppose we define L as an itemset ( L = {Bread, Jam} ) and set our
minimum support to 50 percent ( s = 50% ).
If at least 50 percent of the transactions contain the itemset L , we say L is a
frequent itemset.
It follows that if 50 percent of the transactions contain {Bread, Jam} , then at least
50 percent of the transactions contain {Bread} and at least 50 percent contain {Jam} .
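The support calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the text: the function name support and the four toy transactions are assumptions made up for the example.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


# Hypothetical toy data, chosen only for illustration.
transactions = [
    frozenset({"Bread", "Jam"}),
    frozenset({"Bread", "Jam", "Milk"}),
    frozenset({"Bread", "Milk"}),
    frozenset({"Milk"}),
]

s_pair = support({"Bread", "Jam"}, transactions)   # 2 of 4 transactions
s_bread = support({"Bread"}, transactions)         # 3 of 4 transactions
s_jam = support({"Jam"}, transactions)             # 2 of 4 transactions

# Each single item is at least as frequent as the pair that contains it.
print(s_pair, s_bread, s_jam)
```

With a minimum support of 50 percent, {Bread, Jam} is frequent in this toy data, and so are {Bread} and {Jam} individually.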
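The Apriori principle is that every subset of a frequent itemset is also frequent.
Equivalently, if an itemset is not frequent, none of its supersets can be frequent,
which is what makes pruning possible.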
The Apriori approach works bottom-up. We first determine the support of every itemset
of size 1 (for example, Bread, Jam, Milk, and so on) and keep the frequent ones.
Then we start pairing them. We find the support for, say, {Bread, Jam} or {Jam,
Milk} or {Milk, Bread} .
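The bottom-up procedure can be sketched as a level-wise search. This is a simplified illustration under stated assumptions rather than a reference implementation: the function name apriori, the variable names, and the toy baskets are all hypothetical, and the join step simply unions pairs of frequent itemsets instead of using the prefix-based join found in optimized versions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all frequent itemsets (level-wise search)."""
    n = len(transactions)

    def support(candidate):
        return sum(1 for t in transactions if candidate <= t) / n

    # Level 1: every single item is a candidate.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        c = frozenset([item])
        s = support(c)
        if s >= min_support:
            current[c] = s

    frequent = dict(current)
    k = 2
    while current:
        # Join step: union pairs of frequent (k-1)-itemsets that form a k-itemset.
        candidates = {a | b for a, b in combinations(list(current), 2)
                      if len(a | b) == k}
        # Prune step: drop any candidate that has an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
        # Count support only for the candidates that survived pruning.
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical toy baskets, chosen only for illustration.
baskets = [
    {"Bread", "Jam"},
    {"Bread", "Jam", "Milk"},
    {"Bread", "Milk"},
    {"Milk"},
]
result = apriori(baskets, min_support=0.5)
for itemset, s in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), s)
```

In this toy data {Jam, Milk} falls below the threshold, so the candidate {Bread, Jam, Milk} is pruned without its support ever being counted.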
The following figure illustrates the pruning performed by the Apriori algorithm: