Database Reference
In-Depth Information
algorithm. The authors of that work brought this development to its logical conclu-
sion by introducing the concept of “witnesses” [ 17 ], itemsets on which constraint
satisfaction is tested to derive pruning information for parts of the search space.
Witness-based mining subsumes mining under (anti-)monotonic and convertible con-
straints, and is capable of handling additional constraint classes, and mining under
conjunctions of constraints. The D-Miner system combines closed itemset mining
(formal concept analysis) with constraints [ 2 , 3 ].
The Constraint Programming for Itemset Mining framework [ 13 ] is built on the
observation that constraint-based search, and constraint programming in particular,
has been studied extensively in the general artificial intelligence literature. It shows
that the mining tasks discussed earlier can be reformalized in terms of constraints
present in generic constraint programming systems; furthermore, such systems pro-
vide a generic framework for constraint propagation which makes it easy to combine
different constraints.
4.4
Implementation Considerations
In the above description, we intentionally left unaddressed how the indicated prop-
agation is performed in detail. In principle, all different data structures that have
been studied in the frequent itemset mining literature can be used in this context as
well. For instance, the MaxMiner and DualMiner algorithms use vertical representa-
tions of the data most similar to that of Eclat; the FP-Bonsai algorithm, on the other
hand, uses FP-Trees [ 5 ]. The impact of data structures in a Constraint Programming
framework was studied by Nijssen et al [ 24 ]. These studies confirm that for good
run times the choice of data structure is important; however, many of the above
propagation procedures can be adapted to different data representations, and hence
the two aspects can be considered orthogonal.
5
Languages
Most of the systems studied earlier require a language for the specification of
constraints. Roughly speaking, three categories can be distinguished within these
languages: special purpose languages, SQL inspired languages, and constraint
programming based languages.
Special Purpose Languages Many constraint-based mining systems implement a
small special purpose language. As an example, this is an expression in the language
underlying the SeqLog system [ 19 ]:
Search WWH ::




Custom Search