Database Reference
In-Depth Information
5.2.4
GenMax
Like MAFIA, GenMax is a uses the vertical representation to speed up counting.
Specifically the tidlists are used by GenMax to speed up the counting approach.
In particular the more recent notion of diffsets [ 72 ] was used, and a depth-first
exploration strategy was used. An approach known as successive focussing was
used to further improve the efficiency of the algorithm. The details of the GenMax
approach may be found in [ 28 ].
5.3
Frequent Closed Itemset Mining Algorithms
The are several frequent closed itemset mining algorithms [ 41 , 42 , 51 - 53 , 64 ,
66 - 69 , 73 ] exist to date. Most of the maximal and closed pattern mining algorithms
are based on different variations of the non-maximal pattern mining algorithms. Typ-
ically pruning strategies are incorporated within the non-maximal pattern mining
algorithms to yield more efficient algorithms.
5.3.1
Close
In this algorithm [ 52 ] authors apply Apriori based patten generation over the closed
itemset search space. The usages of closed itemset lattice (search space) significantly
reduces the overall search space of the algorithm. Close operates in iterative manner.
Each iteration consists of three phases, . First, the closure function is applied for
obtaining the candidate closed itemsets and their support. Next, the obtained set
of candidate closed itemsets are tested against the minimum support constraint. If
succeed, the candidates are marked as frequent closed itemset. Finally the same
procedure is initiated to generate the next level of candidate closed itemsets. This
process continues until all frequent closed itemsets have been generated.
5.3.2
CHARM
CHARM [ 73 ] is a frequent closed itemset mining algorithm, that takes advantage of
the vertical representation of database as in the case of Eclat [ 71 ] for efficient closure
checking operation. For punning the search space CHARM uses the following three
properties. Suppose for itemset P and Q , if tidset( P ) = tidset( Q ), then it replaces
every occurrence of P by P
Q and prune the whole branch under Q . On the
other hand if tidset( P )
Q ,
but does not prune the branch under Q . Finally if, tidset( P ) <> tidset( Q ), none of
the aforementioned prunings can be applied. The initial call of CHARM accepts
a set( I ) of single length frequent item and minimum support as input. As a first
step, it sorts I by the increasing the order of support of the items. For each item P ,
tidset( Q ), it replaces every occurrence of P by P
Search WWH ::




Custom Search