Mining MOUCLAS Patterns and Jumping MOUCLAS Patterns to Construct Classifiers - Data Mining: Theory, Methodology, Techniques, and Applications


relatively smaller number of classification errors because of the greedy strategy. In addition, reducing the number of MPs makes the classifier easier to understand. Therefore, in this sub-step, we identify the first MP with the least number of errors in L and discard all the MPs after it, because these MPs produce more errors. The undiscarded MPs and the default class corresponding to the first MP with the least number of errors in L form our De-MP classifier.
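The pruning rule in this sub-step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: it assumes that each position in the ordered list L already carries the total number of training errors made by the MPs up to that position together with the default class recorded there. The names `Entry`, `total_errors`, and `prune_to_least_errors` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Entry(NamedTuple):
    """One position in the ordered list L (hypothetical representation)."""
    mp: str             # identifier of the MOUCLAS pattern at this position
    default_class: str  # default class recorded when this MP was appended
    total_errors: int   # assumed: errors of MPs up to here plus default class


def prune_to_least_errors(entries: List[Entry]) -> Tuple[List[str], str]:
    """Keep MPs up to the first entry with the fewest errors; drop the rest."""
    if not entries:
        raise ValueError("L must contain at least one MP")
    # min() returns the first index that attains the minimum, which matches
    # "the first MP with the least number of errors".
    best = min(range(len(entries)), key=lambda i: entries[i].total_errors)
    kept = [e.mp for e in entries[: best + 1]]
    return kept, entries[best].default_class


L = [
    Entry("mp1", "a", 5),
    Entry("mp2", "b", 3),
    Entry("mp3", "a", 3),
    Entry("mp4", "b", 4),
]
kept, default_class = prune_to_least_errors(L)
print(kept, default_class)
```

With this toy input, `mp2` and `mp3` tie on the error count, so the cut falls after `mp2`, the first of the two.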

The second step of the MOUCLAS algorithm is shown in Figure 2.

In the testing phase, when we classify a new transaction, the first MP in De-MP satisfying the transaction is used to classify it. In the De-MP classifier, default_class, which has the lowest precedence, specifies a default class for any new sample that is not satisfied by any MP, as in C4.5 [7] and CBA [4].
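The first-match rule with a default class can be illustrated as follows. The sketch assumes, purely for illustration, that the antecedent of an MP is a set of closed numeric intervals, one per attribute; the chapter's actual antecedents are clusters, and the names `satisfies` and `classify` are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]
# Assumed representation of an MP: (antecedent intervals, class label).
Pattern = Tuple[Dict[str, Interval], str]


def satisfies(transaction: Dict[str, float], antecedent: Dict[str, Interval]) -> bool:
    """True if every attribute in the antecedent falls inside its interval."""
    return all(
        attr in transaction and low <= transaction[attr] <= high
        for attr, (low, high) in antecedent.items()
    )


def classify(transaction: Dict[str, float], patterns: List[Pattern], default_class: str) -> str:
    """Return the label of the first satisfied MP, else the default class."""
    for antecedent, label in patterns:  # patterns are in precedence order
        if satisfies(transaction, antecedent):
            return label
    return default_class  # lowest precedence: used only when no MP matches


patterns = [
    ({"age": (20, 30)}, "young"),
    ({"age": (25, 60), "income": (50, 100)}, "mid"),
]
print(classify({"age": 27, "income": 70}, patterns, "other"))
print(classify({"age": 70}, patterns, "other"))
```

The first transaction satisfies both MPs, but only the higher-precedence one is used; the second satisfies none and falls through to the default class.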

4 The MOUCLAS-2 Algorithm

The classification technique, MOUCLAS-2, consists of two main processes:

1. Discovering all JMPs for each class.

2. Calculating their subsup and building a classifier, called J-MP, based on the JMPs.

The core of the MOUCLAS-2 algorithm is to find all cluster_rules, namely the JMPs. The MOUCLAS-2 algorithm works in three sub-steps, which solve the problem of discovering JMPsets and constructing a classifier:

Algorithm: Mining Jumping MOUCLAS Patterns (JMPs) and building the J-MP Classifier

Input: A training transaction database, D;

Output: J-MP Classifier

Methods:

(1) Reduce the dimensionality of transactions d in each class y by the information of the attributes in the corresponding JEPs, and

(2) Identify all the clusters of the database based on the Mountain function, which is a fuzzy set membership function especially capable of transforming quantitative values of attributes in transactions into linguistic terms, and

(3) Generate JMPsets for each class y and calculate their subsup.
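To make sub-step (2) more concrete, the sketch below shows one widely used form of mountain clustering: each candidate grid point receives a "mountain" height that sums exponentially decaying contributions from the data points, the highest peak becomes a cluster center, and its influence is subtracted before the next peak is sought. The exact Mountain function used by MOUCLAS-2 is not given in this section, so the exponential form, the parameters `alpha` and `beta`, the candidate `grid`, and the function names are all assumptions for illustration. The sketch covers only the identification of cluster centers, not the mapping to linguistic terms.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def mountain(candidate: Point, points: Sequence[Point], alpha: float = 1.0) -> float:
    """Height at a candidate center: sum of exp(-alpha * distance) over the data."""
    return sum(math.exp(-alpha * math.dist(candidate, p)) for p in points)


def mountain_centers(
    points: Sequence[Point],
    grid: Sequence[Point],
    n_clusters: int,
    alpha: float = 1.0,
    beta: float = 1.0,
) -> List[Point]:
    """Pick n_clusters grid points by repeatedly taking and flattening the peak."""
    heights = [mountain(v, points, alpha) for v in grid]
    centers: List[Point] = []
    for _ in range(n_clusters):
        k = max(range(len(grid)), key=heights.__getitem__)
        peak, center = heights[k], grid[k]
        centers.append(center)
        # Subtract the chosen peak's influence so nearby points stop competing.
        heights = [
            h - peak * math.exp(-beta * math.dist(v, center))
            for v, h in zip(grid, heights)
        ]
    return centers


# Toy one-dimensional data with two visible groups (values are made up).
data = [(0.9,), (1.0,), (1.1,), (5.0,), (5.1,)]
grid = [(float(i),) for i in range(7)]
print(mountain_centers(data, grid, n_clusters=2))
```

The grid resolution and the two decay parameters control how many distinct peaks survive; in practice they would have to be tuned to the scale of each quantitative attribute.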

For the first sub-step, the detailed method concerning JEPs can be found in [6].

The third sub-step of the MOUCLAS-2 algorithm forms the cluster_rules, with any number of predicates in the antecedent. It brings us a step further towards the solution of our research challenge. From this set of cluster_rules of a class y, we produce a set of JMPs for the class y.

Let I be the set of all items in D labeled with class y, C be the dataset of transactions d labeled with class y after dimensionality reduction processing by a JEP, where transaction d ∈ C contains X_i ⊆ I, a k-itemset, and i be the number of JEPs in the class y. Let E denote a set of cluster_rules (JMPset) of a class y, corresponding to a JEP, where e ∈
