relatively smaller number of classification errors, because of the greedy strategy. In
addition, reducing the number of MPs can make the classifier easier to understand.
Therefore, in this sub-step, we identify the first MP with the least number of errors in
L and discard all the MPs after it, because these MPs produce more errors. The
undiscarded MPs, together with the default class corresponding to the first MP with the
least number of errors in L, form our De-MP classifier.
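The pruning sub-step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_de_mp`, the list `errors_after_each` (the total error count recorded when each MP was appended to L) and the list `default_classes` (the default class recorded at the same point) are hypothetical names, not taken from the original algorithm listing.

```python
from typing import List, Sequence, Tuple, TypeVar

MP = TypeVar("MP")


def build_de_mp(
    ordered_mps: Sequence[MP],
    errors_after_each: Sequence[int],
    default_classes: Sequence[str],
) -> Tuple[List[MP], str]:
    """Keep the MPs up to and including the first one with the fewest errors.

    ordered_mps       -- MPs in precedence order (the list L)
    errors_after_each -- assumed: total errors recorded after adding each MP
    default_classes   -- assumed: default class recorded after adding each MP
    """
    if not ordered_mps:
        raise ValueError("L must contain at least one MP")
    if not (len(ordered_mps) == len(errors_after_each) == len(default_classes)):
        raise ValueError("all three sequences must have the same length")

    # min() returns the first minimal element, so ties go to the earliest MP.
    cut = min(range(len(errors_after_each)), key=errors_after_each.__getitem__)

    # Everything after the cut position is discarded.
    return list(ordered_mps[: cut + 1]), default_classes[cut]
```

With error counts `[5, 2, 2, 4]`, the cut falls on the second MP, so the third and fourth MPs are dropped and the default class recorded at the second position is kept.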
The second step of the MOUCLAS algorithm is shown in Figure 2.
In the testing phase, when we classify a new transaction, the first MP in De-MP
that satisfies the transaction is used to classify it. In the De-MP classifier, default_class,
which has the lowest precedence, specifies a default class for any new sample
that is not satisfied by any MP, as in C4.5 [7] and CBA [4].
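The first-match rule with a default fallback can be sketched as below. For illustration only, an MP is represented as a hypothetical pair (set of items, class label), and "satisfying" a transaction is simplified to a subset test; in MOUCLAS the antecedents are clusters of quantitative values, so the real matching test would differ.

```python
from typing import FrozenSet, Iterable, Sequence, Tuple

# Assumed representation: (antecedent items, predicted class label).
Rule = Tuple[FrozenSet[str], str]


def classify(transaction: Iterable[str], de_mp: Sequence[Rule], default_class: str) -> str:
    """Return the label of the first MP whose antecedent the transaction satisfies."""
    items = set(transaction)
    for antecedent, label in de_mp:  # de_mp is already in precedence order
        if antecedent <= items:      # simplified "satisfies" test (assumption)
            return label
    # No MP matched: default_class has the lowest precedence.
    return default_class
```

Because the list is scanned in precedence order, a transaction matched by several MPs always receives the label of the highest-ranked one.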
4 The MOUCLAS-2 Algorithm
The classification technique, MOUCLAS-2, consists of two main processes:
1. Discovering all JMPs for each class.
2. Calculating their subsup and building a classifier, called J-MP, based on the JMPs.
The core of the MOUCLAS-2 algorithm is to find all cluster_rules, namely the
JMPs. The algorithm works in three sub-steps, which together solve the problem of
discovering JMPsets and constructing a classifier:
Algorithm: Mining Jumping MOUCLAS Patterns (JMPs) and building the J-MP Classifier
Input: A training transaction database, D ;
Output: J-MP Classifier
Methods:
(1) Reduce the dimensionality of the transactions d in each class y using the
attribute information in the corresponding JEPs, and
(2) Identify all the clusters of the database based on the Mountain function, which
is a fuzzy set membership function that is especially capable of transforming
quantitative values of attributes in transactions into linguistic terms, and
(3) Generate JMPsets for each class y and calculate their subsup.
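To make sub-step (2) more concrete, here is a minimal sketch of mountain-function clustering in the style of the Yager and Filev mountain method. The excerpt does not give the formula, so everything here is an assumption: the exponential form of the mountain function, the parameters `alpha` and `beta`, the regular grid of candidate centres, and data scaled to the unit hypercube. The function name `mountain_clusters` is hypothetical.

```python
import math
from itertools import product
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def mountain_clusters(
    points: Sequence[Point],
    n_clusters: int = 2,
    grid_steps: int = 11,
    alpha: float = 5.0,
    beta: float = 5.0,
) -> List[Point]:
    """Pick cluster centres as successive peaks of a mountain function.

    Assumes every coordinate of every point lies in [0, 1].
    """
    dim = len(points[0])
    axis = [i / (grid_steps - 1) for i in range(grid_steps)]
    grid = list(product(axis, repeat=dim))  # candidate centres

    # Mountain function: each data point adds height that decays with distance.
    height = [
        sum(math.exp(-alpha * math.dist(v, x)) for x in points) for v in grid
    ]

    centres: List[Point] = []
    for _ in range(n_clusters):
        peak = max(range(len(grid)), key=height.__getitem__)
        peak_height, centre = height[peak], grid[peak]
        centres.append(centre)
        # Flatten the mountain around the chosen peak so the next peak
        # is found in a different region of the data.
        height = [
            h - peak_height * math.exp(-beta * math.dist(v, centre))
            for h, v in zip(height, grid)
        ]
    return centres
```

In this sketch each returned centre could play the role of a linguistic term (for example "low" or "high") for a quantitative attribute; how MOUCLAS-2 actually maps clusters to terms is not specified in this excerpt.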
The detailed method concerning JEPs used in the first sub-step can be found in [6].
The third sub-step of the MOUCLAS-2 algorithm forms the cluster_rules, with any
number of predicates in the antecedent. It brings us a step further towards the solution
of our research challenge. From this set of cluster_rules of a class y, we produce a set
of JMPs for the class y.
Let I be the set of all items in D labeled with class y, and let C be the dataset of
transactions d labeled with class y after dimensionality reduction by a JEP, where
each transaction d ⊆ I is a k-itemset, and let i be the number of JEPs in the
class y . Let E denote a set of cluster_rules ( JMPset ) of a class y , corresponding to a
JEP, where e
C contains X i
E.