Database Reference
In-Depth Information
In the first sub-step, the MPs are sorted in descending order, which has the training
transactions surely covered by the MPs with the highest precedence when possible in
the next sub-step. The sort of the whole set of MPs is performed following the
definition of precedence :
Given two MPs , we say that MP a has a higher precedence than MP b , denoted as
MP a
MP b ,
if
MPs , it holds that: the confidence of MP a is greater than that of
MP b , or if their confidences are the same, but the support of MP a is greater than that
of MP b , or if both the confidences and supports of MP a and MP b are the same, but
MP a is generated earlier than MP b .
In the second sub-step, we test the MPs following decreasing precedence and stop
the sub-step when there is no rule or no training transaction. For each MP , we scan C
to find those transactions satisfying the cluset of the MP . If the MP can correctly
classify one transaction, we store it in a set denoted as L . Those transactions satisfying
the cluset of the MP will be removed from C at each pass. Each transaction can be
identified by a unique ID. The next pass will be performed on the remaining data. A
default class is defined at each scan, which is the majority class in the remaining data.
At the end of each pass, the total number of errors that are made by the current L and
the default class are also stored. When there is no rule or no training transaction left,
we terminate this sub-step. After this sub-step, every MP in L can correctly classify at
least one training transaction in C .
In the third sub-step, though we would like to find as many MPs as possible to give
good coverage of the training transactions in the second sub-step, we prefer strong
MPs which have relatively high support and confidence, due to their characteristics of
corresponding to larger coverage and stronger differentiating power. Meanwhile, we
hope that the De-MP classifier, consisting of a combination of strong MPs , has a
MP a , MP b
1 R = sort( R ); // sort MPs based on their precedence
2 for each MP R in sequence do
3 temp = ;
4 for each transaction d C do
5 if d satisfies the cluset of MP then
6 store d .ID in temp ;
7 if MP correctly classifies d then
8 insert MP at the end of L ;
9 delete the transaction who has ID in temp from C ;
10 selecting a default class for the current L ; // determine the default class based on majority class of
remaining transactions in C
11 end
12 compute the total number of errors of L ; // compute the total number of errors that are made by the
current L and the default class
13 end
14 Find the first MP in L with the lowest total number of errors and discard all the MPs after the MP in L ;
15 Add the default class associated with the above mentioned first MP to end of L ;
16 De-MP classifier = L
Fig. 2. The Second Step of the MOUCLAS Algorithm
Search WWH ::




Custom Search