Database Reference
In-Depth Information
2.3
Constrained Frequent Pattern Mining
Off-the-shelf frequent pattern mining algorithms discover a large number of patterns
which are not useful when it is desired to determine patterns on the basis of more
refined criteria. Frequent pattern mining methods are often particularly useful in the
context of constrained applications, in which rules satisfying particular criteria are
discovered. For example, one may desire specific items to be present in the rule. One
solution is to first mine all the itemsets, and then enable online mining from this set
of base patterns [ 3 ]. However, pushing constraints directly into the mining process
has several advantages. This is because when constraints are pushed directly into
the mining process, the mining can be performed at much lower support levels than
can be performed by using a two-phase approach. This is especially the case when
a large number of intermediate candidates can be pruned by the constraint-based
pattern mining algorithm.
A variety of arbitrary constraints may also be present in the patterns. The major
problem with such methods is that the constraints may result in the violation of
the downward closure property. Because most frequent pattern mining algorithms
depend crucially on this property, its violation is a serious issue. Nevertheless, many
constraints have specialized properties because of which specialized algorithms can
be developed. Methods for constrained frequent pattern mining method have been
discussed in [ 55 , 57 , 60 ]. Constrained methods have also been developed for the
sequential pattern mining problem [ 31 , 61 ]. In real applications, the output of the
vanilla frequent pattern mining problem may be too large, and it is only by pushing
constraints into the pattern mining process, that useful application-specific patterns
can be found. Constrained frequent pattern mining methods are closely related to
the problem of pattern-based classification, because the latter problem requires us to
discover discriminative patterns from the underlying data. Methods for constrained
frequent pattern mining will be discussed in Chap. 2.
2.4
Compressed Representations of Frequent Patterns
A major problem in frequent pattern mining algorithms is that the volume of the min-
ing patterns is often extremely large. This scenario creates numerous challenges for
using these patterns in a meaningful way. Furthermore, different kinds of redundancy
are present in the mined patterns. For example, maximal patterns imply the presence
of all their subsets in the data. There is some information loss in terms of the exact
support values of these subsets. Therefore, if it is not needed to preserve the values
of the support across the patterns, then the determination of concise representations
can be very useful.
A particularly interesting form of concise representation is that of closed patterns
[ 56 ]. An itemset X is set to be closed if none of its supersets have the same support
as X . Therefore, by determining all the closed frequent patterns, one can derive
not only the exhaustive set of frequent itemsets, but also their supports. Note that
Search WWH ::




Custom Search