Database Reference
In-Depth Information
of frequent closed itemsets, a new visitation scheme of the search space is intro-
duced. Such a visitation scheme results a disjoint sub division of the search space.
This also facilitates parallelism. DCI_CLOSED applies several optimization tricks
to improve execution time, such as the bitwise intersection of tidsets to compute
support and closure. Where possible, it reuses previously computed intersections to
avoid redundant computations.
6
Other Optimizations and Variations
In this section, a number of other optimizations and variations of frequent pattern
mining algorithms will be discussed. Many of these methods are discussed in detail
in other chapters of this topic, and therefore they will be discussed only briefly here.
6.1
Row Enumeration Methods
Not all frequent pattern mining algorithms follow the fundamental steps of baseline
algorithm, there exists a number of special cases, for which specialized frequent
pattern mining algorithms have been designed. An interesting case is that of micro-
array data sets, in which the columns are very long but the number of rows are not
very large. In such cases, a method called row-enumeration is used [ 22 , 23 , 40 , 48 ,
49 ] instead of the usual column enumeration, in which combinations of rows are
examined during the search process. There are two categories of row enumeration
algorithm. One category algorithm perform bottom-up [ 22 , 23 , 48 ] search over
the row enumeration tree whereas other category algorithms perform top-down[ 40 ]
search strategy.
Row enumeration algorithms perform mining over the transpose of the transaction
database. In transpose database, each transaction id become item and each item cor-
responds a transaction. Mining over the transposed database is basically the bottom
up search for frequent patterns by enumeration of row sets. However, the bottom-up
search strategy cannot take advantage of user-specified minimum support threshold
to effectively prune the search space, and therefore leads to longer running time
and large memory overhead. As a solution [ 40 ] introduce a top-down approach of
mining using a novel row enumeration tree. Their approach can take full advantage
of user-defined minimum support value and prune the search space efficiently hence
lower down the execution time.
Note that, both of the search strategies are applied over the transposed transaction
database. Most of developed algorithm using row enumeration technique concentrate
on mining frequent closed itemset (explained in Sect. 5 ). The reason behind this
motivation is that due to the nature of micro-array data there exists a large number
of redundancy among the frequent patterns for a minimum support threshold and
closed patterns are capable of summarizing the whole database. These strategies will
be discussed in detail in Chap. 4, and therefore only a brief discussion is provided
here.
Search WWH ::




Custom Search