2. Based on all patterns, manipulate the entire data
3. Recur on the new blocks
Notwithstanding this statement, the REMINE algorithm proposed by [45] is so far
the only one to proceed in this way to iteratively mine supervised patterns.
3.4
Data Instance-Based Selection
In addition to the partition-based techniques, there is another paradigm, which selects
patterns based on individual instances. The Harmony algorithm retains for each
training instance the highest-confidence rule, as does CCCS, whereas the technique
described by [28], called Large Bayes (LB), selects patterns based on the instances
whose labels are to be predicted. This is similar to DEEP, described by [26], and
LAC, proposed by [37], which only generate patterns that match the instances to be
predicted by projecting the data on the items contained in the unlabeled instance.
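The projection step used by DEEP- and LAC-style methods can be sketched as follows; the data representation (transactions as item sets paired with a label) and the function name are illustrative assumptions, not the published implementations:

```python
# Sketch of instance-based projection (assumed simplification): before
# mining, each training transaction is intersected with the items of the
# unlabeled instance, so only patterns that can match it are generated.

def project(training_data, test_items):
    """Keep, for each training transaction, only the items that also
    appear in the unlabeled test instance; drop empty intersections."""
    test_items = set(test_items)
    projected = []
    for items, label in training_data:
        kept = set(items) & test_items
        if kept:  # transaction shares at least one item with the instance
            projected.append((frozenset(kept), label))
    return projected

train = [({"a", "b", "c"}, "+"), ({"b", "d"}, "-"), ({"e"}, "+")]
projected = project(train, {"a", "b"})
```

Any pattern mined from the projected data is guaranteed to be contained in the unlabeled instance, which shrinks the search space to exactly the patterns needed for this one prediction.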
4
Classifier Construction
After supervised patterns have been mined, and suitable subsets have been selected,
the remaining question is how to employ them for predictive purposes. The solutions
that have been found fall into two main categories: (1) direct use of patterns as rules to
predict the label of an unseen class—the techniques following this paradigm borrow
heavily from rule learning approaches in machine learning, or (2) indirect use of
patterns in a model; here patterns are typically treated as features that are used in
well-established machine learning methods.
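The indirect paradigm can be illustrated with a minimal sketch, assuming patterns and instances are represented as item sets (a simplification; real systems attach labels and quality scores to each pattern):

```python
# Sketch of the indirect use of patterns: each mined pattern becomes a
# binary feature (1 if the pattern's items are all present in the
# instance, 0 otherwise), and the resulting vectors can feed any
# standard learner such as an SVM or decision tree.

def to_feature_vector(instance_items, patterns):
    """Map an instance to a 0/1 vector over the mined patterns."""
    items = set(instance_items)
    return [1 if set(p) <= items else 0 for p in patterns]

patterns = [{"a"}, {"a", "b"}, {"c"}]
vector = to_feature_vector({"a", "b"}, patterns)  # -> [1, 1, 0]
```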
4.1
Direct Classification
There are two main methods in rule learning when it comes to making predictions.
In decision lists, rules are ordered according to some criterion and the first rule
that matches the unseen instance makes the prediction. For such classifiers to
work, rules are required that have high accuracy and at the same time do not overfit the training data.
This means that certain approaches to optimizing quality measures will work better
than others: given that maximizing information gain or χ² trades off correlation
with effect size, maximizing confidence or WRACC will be more suitable for such
classifiers. CBA follows this first approach, ordering the rule list by confidence
(descending), support (descending) and length (ascending), as does LAC, ordering
by information gain (descending).
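A decision list with CBA-style ordering can be sketched as below; the rule representation as a tuple and the default class are illustrative assumptions:

```python
# Minimal decision-list sketch with CBA-style rule ordering:
# confidence descending, support descending, antecedent length ascending.
# The first rule whose antecedent is contained in the instance fires.

def order_rules(rules):
    # rules: list of (antecedent_items, label, confidence, support)
    return sorted(rules, key=lambda r: (-r[2], -r[3], len(r[0])))

def predict(rules, instance, default="?"):
    instance = set(instance)
    for antecedent, label, _conf, _supp in order_rules(rules):
        if set(antecedent) <= instance:  # first matching rule predicts
            return label
    return default  # no rule matched; fall back to a default class

rules = [
    ({"a"}, "+", 0.9, 10),
    ({"b"}, "-", 0.9, 12),   # ties on confidence; higher support ranks first
    ({"a", "b"}, "-", 0.8, 5),
]
label = predict(rules, {"a", "b", "c"})  # rule ({"b"}, "-") fires -> "-"
```

Note that the tie-breaking criteria matter: for the instance {a, b, c} both single-item rules match, and the support tie-break determines which one fires.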
The second method consists of various voting mechanisms that collect all rules
that match the unseen instance and have each class “gather votes” from them. This