Fig. 17.1 The process of classifier construction via supervised pattern mining
In this chapter, we will provide an overview of pattern mining techniques that
can be used in such a supervised context. The patterns found by these techniques
can often be interpreted as rules: the conditions of the rule identify examples for
which a certain property in the target attribute holds. The techniques are hence
related to Machine Learning: many traditional Machine Learning algorithms are
rule-based as well. A natural question is how to link these two fields to each other,
in particular given that the two areas have complementary strengths: most traditional
machine learning techniques deal with the large search space of potential rules by
adopting heuristics; pattern mining methods, on the other hand, offer more efficient
methods for traversing a search space exhaustively, promising to find better rules
than those found by traditional rule learners. We will address this as well.
The earliest techniques that integrated both areas mirrored the FIM techniques
closely, using support and confidence to constrain itemsets and rules, and support's
anti-monotonicity to prune the search space. Beyond new challenges, however,
supervised pattern mining also offers new opportunities: the supervision makes it
possible to use additional quality measures and to prune the search space based on
properties of the constraints defined on these measures. By now, the field has developed far from its origins,
encompassing other representations, incorporating approaches and quality measures
developed in the context of Machine Learning, and paying much attention to pattern
set mining.
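The support- and confidence-based approach described above can be sketched as follows. This is a minimal illustration, not the algorithm of any particular system: the toy data, the function names (support, confidence, mine_rules), and the thresholds are all assumptions made for the example. It performs a level-wise search in which only frequent itemsets are extended, which is safe because support is anti-monotone: a superset can never have higher support than any of its subsets.

```python
from itertools import combinations

# Hypothetical toy data: each example is (set of items, class label).
DATA = [
    ({"a", "b"}, "pos"),
    ({"a", "b", "c"}, "pos"),
    ({"a", "c"}, "neg"),
    ({"b", "c"}, "neg"),
    ({"a", "b", "d"}, "pos"),
]


def support(itemset, data):
    """Number of examples whose items contain the itemset."""
    return sum(1 for items, _ in data if itemset <= items)


def confidence(itemset, label, data):
    """Fraction of the examples covered by the itemset that carry the label."""
    covered = [lab for items, lab in data if itemset <= items]
    return covered.count(label) / len(covered) if covered else 0.0


def mine_rules(data, min_support, min_confidence):
    """Level-wise search for rules 'itemset -> label' under both thresholds."""
    labels = sorted({lab for _, lab in data})
    items = sorted({i for its, _ in data for i in its})
    level = [frozenset([i]) for i in items]  # candidate itemsets of size 1
    rules = []
    while level:
        k = len(level[0]) + 1  # size of the next level's candidates
        # Anti-monotone pruning: infrequent itemsets are never extended.
        frequent = [s for s in level if support(s, data) >= min_support]
        for s in frequent:
            for lab in labels:
                conf = confidence(s, lab, data)
                if conf >= min_confidence:
                    rules.append((s, lab, support(s, data), conf))
        # Join frequent itemsets; keep a candidate only if every subset
        # obtained by dropping one item is itself frequent.
        freq_set = set(frequent)
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        level = sorted(
            (c for c in candidates
             if all(frozenset(sub) in freq_set for sub in combinations(c, k - 1))),
            key=sorted,
        )
    return rules


rules = mine_rules(DATA, min_support=2, min_confidence=1.0)
```

On this toy data, with a minimum support of 2 and a minimum confidence of 1.0, the only rule returned is {a, b} -> pos with support 3. The item d is discarded at the first level, so no itemset containing d is ever generated or counted.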
The latter topic is not limited to supervised pattern mining but is of particular
importance there: when constructing classifiers, whether rule lists or sets, decision
trees, or non-symbolic classifiers, redundant or irrelevant patterns are
often detrimental to the classifier's performance.
We have given a unifying perspective on pattern-based classification in the past
[9] in which we focused on two dimensions. The first concerned pattern set mining,
specifically whether techniques performed post-processing, selecting some patterns
out of the result set of a single pattern mining step, or whether they iterated pattern
mining. The second dimension focused on whether they let the pattern mining and
selection process be guided by a particular model or not. While these distinctions
still stand, in our opinion, we have decided to structure this chapter differently,
discussing each of the three steps shown in Fig. 17.1 separately: pattern mining,
pattern set mining, and finally classifier construction, and surveying the different,
sometimes numerous, options available.