relatively smaller number of classification errors because of the greedy strategy. In addition, reducing the number of MPs can make the classifier easier to understand. Therefore, in this sub-step, we identify the first MP with the least number of errors in L and discard all the MPs after it, because these MPs produce more errors. The undiscarded MPs, together with the default class corresponding to the first MP with the least number of errors in L, form our De-MP classifier.
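The truncation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name build_de_mp, the parallel-list layout, and the sample values are hypothetical and do not come from the MOUCLAS paper. It assumes that, for each MP in the ordered list L, the default class and the total number of errors recorded at that point are already available.

```python
from typing import List, Sequence, Tuple


def build_de_mp(
    mps: Sequence[str],
    default_classes: Sequence[str],
    total_errors: Sequence[int],
) -> Tuple[List[str], str]:
    """Keep the MPs up to the first one with the fewest recorded errors.

    mps, default_classes and total_errors are parallel sequences ordered
    by precedence (an assumed layout, not the paper's data structure).
    Returns the undiscarded MPs and the default class stored with the cut.
    """
    if not mps:
        raise ValueError("at least one MP is required")
    if not (len(mps) == len(default_classes) == len(total_errors)):
        raise ValueError("inputs must have the same length")
    # list.index returns the FIRST position holding the minimum,
    # which matches "the first MP with the least number of errors".
    cut = list(total_errors).index(min(total_errors))
    return list(mps[: cut + 1]), default_classes[cut]


# Hypothetical values for illustration only.
kept, default_class = build_de_mp(
    ["mp1", "mp2", "mp3", "mp4"],
    ["a", "b", "a", "b"],
    [9, 4, 4, 6],
)
```

In this example the minimum error count first occurs at the second MP, so mp3 and mp4 are discarded and the default class recorded with mp2 is kept.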
The second step of the MOUCLAS algorithm is shown in Figure 2.
In the testing phase, when we classify a new transaction, the first MP in De-MP satisfying the transaction is used to classify it. In the De-MP classifier, default_class, which has the lowest precedence, specifies a default class for any new sample that is not satisfied by any other MP, as in C4.5 [7] and CBA [4].
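A minimal sketch of this first-match classification is given below. The representation is an assumption made for illustration: each MP is modeled as a pair of a predicate and a class label, and the attribute names and ranges in the example are hypothetical rather than taken from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

Transaction = Dict[str, float]
# Assumed representation: (predicate deciding whether the MP is satisfied, class label).
MP = Tuple[Callable[[Transaction], bool], str]


def classify(transaction: Transaction, de_mp: Sequence[MP], default_class: str) -> str:
    """Return the label of the first MP satisfied by the transaction.

    de_mp is assumed to be sorted by precedence, highest first.
    default_class has the lowest precedence and is used only when
    no MP is satisfied.
    """
    for satisfies, label in de_mp:
        if satisfies(transaction):
            return label
    return default_class


# Hypothetical MPs over two quantitative attributes.
de_mp = [
    (lambda t: t["age"] < 30 and t["income"] > 50.0, "yes"),
    (lambda t: t["age"] >= 60, "no"),
]

first = classify({"age": 25, "income": 80.0}, de_mp, "maybe")
second = classify({"age": 65, "income": 10.0}, de_mp, "maybe")
fallback = classify({"age": 45, "income": 20.0}, de_mp, "maybe")
```

Because the list is scanned in precedence order, a transaction that satisfies several MPs receives the label of the highest-ranked one.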
4 The MOUCLAS-2 Algorithm
The classification technique, MOUCLAS-2, consists of two main processes:
1. Discovering all JMPs for each class.
2. Calculating their subsup and building a classifier, called J-MP, based on the JMPs.
The core of the MOUCLAS-2 algorithm is to find all cluster_rules, namely the JMPs. The MOUCLAS-2 algorithm works in three sub-steps, which solve the problem of discovering JMPsets and constructing a classifier:
Algorithm: Mining Jumping MOUCLAS Patterns (JMPs) and building the J-MP Classifier
Input: A training transaction database, D;
Output: J-MP Classifier
Methods:
(1) Reduce the dimensionality of transactions d in each class y by the information of the attributes in the corresponding JEPs, and
(2) Identify all the clusters of the database based on the Mountain function, which is a fuzzy set membership function and is especially capable of transforming quantitative values of attributes in transactions into linguistic terms, and
(3) Generate JMPsets for each class y and calculate their subsup.
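To make sub-step (2) more concrete, the sketch below shows one way a mountain function can locate clusters in a single quantitative attribute and map values to linguistic terms. The text above does not give the formula, so this sketch assumes the classical mountain method of Yager and Filev with an absolute-value distance; the parameters alpha and beta, the candidate grid, the term names, and all function names are illustrative assumptions, not the MOUCLAS-2 definition.

```python
import math
from typing import List, Sequence


def mountain(candidates: Sequence[float], points: Sequence[float], alpha: float = 1.0) -> List[float]:
    """Assumed mountain function: M(v) = sum over x of exp(-alpha * |v - x|)."""
    return [sum(math.exp(-alpha * abs(v - x)) for x in points) for v in candidates]


def mountain_centers(
    points: Sequence[float],
    candidates: Sequence[float],
    n_centers: int,
    alpha: float = 1.0,
    beta: float = 1.0,
) -> List[float]:
    """Pick cluster centers by repeatedly taking the peak and flattening around it."""
    heights = mountain(candidates, points, alpha)
    centers: List[float] = []
    for _ in range(n_centers):
        peak = max(range(len(candidates)), key=heights.__getitem__)
        center, peak_height = candidates[peak], heights[peak]
        centers.append(center)
        # Subtract a scaled bump centered at the chosen peak so that
        # the next iteration finds a different dense region.
        heights = [
            h - peak_height * math.exp(-beta * abs(v - center))
            for h, v in zip(heights, candidates)
        ]
    return centers


def to_term(value: float, centers: Sequence[float], names: Sequence[str]) -> str:
    """Map a quantitative value to the linguistic term of its nearest center."""
    nearest = min(range(len(centers)), key=lambda i: abs(value - centers[i]))
    return names[nearest]


# Hypothetical attribute values forming two groups.
points = [1.0, 1.1, 0.9, 5.0, 5.2, 4.8]
candidates = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
centers = mountain_centers(points, candidates, n_centers=2)
term = to_term(4.7, centers, ["low", "high"])
```

Here the two densest grid points are selected as centers, and a value is labeled with the term of the closest one; a fuzzy variant would return membership degrees instead of a single term.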
In the first sub-step, the detailed method concerning JEPs can be found in [6].
The third sub-step of the MOUCLAS-2 algorithm forms the cluster_rules, with any number of predicates in the antecedent. It brings us a step further towards the solution of our research challenge. From this set of cluster_rules of a class y, we produce a set of JMPs for the class y.
Let I be the set of all items in D labeled with class y, C be the dataset of transactions d labeled with class y after dimensionality reduction processing by a JEP, where each transaction d ⊆ I is a k-itemset, and i be the number of JEPs in the class y. Let E denote a set of cluster_rules (JMPset) of a class y, corresponding to a JEP, where e ∈ C contains X i ⊆ ∈ E.