Supervised Pattern Mining and Applications to Classification - Frequent Pattern Mining

Fig. 17.1 The process of classifier construction via supervised pattern mining

In this chapter, we will provide an overview of pattern mining techniques that
can be used in such a supervised context. The patterns found by these techniques
can often be interpreted as rules: the conditions of the rule identify examples for
which a certain property in the target attribute holds. The techniques are hence
related to Machine Learning: many traditional Machine Learning algorithms are
rule-based as well. A natural question is how to link these two fields to each other,
in particular given that the two areas have complementary focuses: most traditional
Machine Learning techniques deal with the large search space of potential rules by
adopting heuristics; pattern mining methods, on the other hand, offer more efficient
methods for traversing a search space exhaustively, promising to find better rules
than those found by traditional rule learners. We will address this as well.

The earliest techniques that integrated both areas mirrored the FIM techniques
closely, using support and confidence to constrain itemsets and rules, and support's
anti-monotonicity to prune the search space. In addition to new challenges, however,
supervised pattern mining also offers new opportunities, since the supervision
allows the use of additional quality measures and pruning based on the properties of
constraints derived from these measures. By now, the field has developed far from its
origins, encompassing other representations, incorporating approaches and quality
measures developed in the context of Machine Learning, and paying much attention to
pattern set mining.
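The support-and-confidence scheme of the earliest techniques can be illustrated with a minimal sketch. It is not the algorithm of any particular system: the function name `mine_class_rules`, its parameters, and the choice to measure support on the itemset alone (rather than on itemset and class together) are assumptions made for illustration. The sketch enumerates itemsets level by level, discards any itemset below the support threshold together with all of its supersets (anti-monotonicity), and reports a rule `itemset -> label` whenever its confidence reaches the confidence threshold.

```python
from itertools import combinations


def mine_class_rules(rows, labels, min_support, min_confidence):
    """Level-wise search for rules `itemset -> label` (illustrative sketch).

    rows:   list of sets of items, one set per example
    labels: list of class labels, aligned with rows
    Support is the fraction of rows containing the itemset; confidence is
    the fraction of those rows that carry the rule's label.
    """
    n = len(rows)
    items = sorted(set().union(*rows))
    rules = []
    frontier = [frozenset([item]) for item in items]

    while frontier:
        frequent = []
        for itemset in frontier:
            # Labels of the examples covered by the itemset.
            covered = [lab for row, lab in zip(rows, labels) if itemset <= row]
            support = len(covered) / n
            if support < min_support:
                # Anti-monotonicity: no superset can reach min_support,
                # so this itemset is never extended.
                continue
            frequent.append(itemset)
            for label in sorted(set(covered)):
                confidence = covered.count(label) / len(covered)
                if confidence >= min_confidence:
                    rules.append((itemset, label, support, confidence))

        # Join frequent k-itemsets into (k+1)-candidates and keep only
        # those whose k-subsets are all frequent.
        frequent_set = set(frequent)
        candidates = set()
        for a, b in combinations(frequent, 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                union - {x} in frequent_set for x in union
            ):
                candidates.add(union)
        frontier = sorted(candidates, key=sorted)

    return rules
```

The search is exhaustive within the support constraint, which is the contrast with heuristic rule learners drawn above; the additional quality measures made possible by supervision would replace or complement the confidence test in the inner loop.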

The latter topic is not limited to supervised pattern mining but is of particular
importance there: when constructing classifiers, be they rule lists or sets, decision
trees, or non-symbolic classifiers, redundancy among patterns, or their irrelevance,
is often detrimental to the classifier's performance.

We have given a unifying perspective on pattern-based classification in the past
[9], in which we focused on two dimensions. The first concerned pattern set mining,
specifically whether techniques performed post-processing, selecting some patterns
out of the result set of a single pattern mining step, or whether they iterated pattern
mining. The second dimension focused on whether they let the pattern mining and
selection process be guided by a particular model or not. While these distinctions
still stand in our opinion, we have decided to structure this chapter differently,
discussing each of the three steps shown in Fig. 17.1 separately: pattern mining,
pattern set mining, and finally classifier construction, and surveying the different,
sometimes numerous, options available.
