Table 6.5. Overview of the learning algorithms constructed by CAMLET for the datasets of rule sets learned from the UCI benchmark datasets
(each entry: original classifier set / overall control structure / final evaluation method)

anneal
  Distribution I:   C4.5 tree / Win+Boost+CS / Weighted Voting
  Distribution II:  C4.5 tree / Boost+CS / Weighted Voting
  Distribution III: C4.5 tree / Boost+CS / Weighted Voting
audiology
  Distribution I:   ID3 tree / Boost / Weighted Voting
  Distribution II:  Random Rules / CS+GA / Voting
  Distribution III: Random Rules / Simple Iteration / Best Select.
autos
  Distribution I:   Random Rules / Win+Iteration / Weighted Voting
  Distribution II:  ID3 tree / Boost+Iteration / Weighted Voting
  Distribution III: Random Rules / Boost / Weighted Voting
balance-scale
  Distribution I:   Random Rules / Boost / Weighted Voting
  Distribution II:  Random Rules / Boost+CS / Voting
  Distribution III: Random Rules / CS+GA / Voting
breast-cancer
  Distribution I:   ID3 tree / Boost+CS+Iteration / Weighted Voting
  Distribution II:  Random Rules / GA+Iteration / Voting
  Distribution III: Random Rules / Win+Iteration / Weighted Voting
breast-w
  Distribution I:   ID3 tree / Win / Weighted Voting
  Distribution II:  ID3 tree / Iteration / Best Select.
  Distribution III: ID3 tree / CS+Iteration / Weighted Voting
colic
  Distribution I:   Random Rules / CS+Win / Voting
  Distribution II:  ID3 tree / Win+Iteration / Best Select.
  Distribution III: ID3 tree / Win+Iteration / Voting
credit-a
  Distribution I:   C4.5 tree / Win+Iteration / Voting
  Distribution II:  Random Rules / Win+Iteration / Best Select.
  Distribution III: ID3 tree / CS+Boost+Iteration / Best Select.
CS means including reinforcement of the classifier set from Classifier Systems.
Win means including methods and control structures from the Window Strategy.
Boost means including methods and control structures from Boosting.
GA means including reinforcement of the classifier set with a Genetic Algorithm.
distributions. The class distribution for "Distribution I" is P = (0.35, 0.3, 0.3), where p_i is the probability of class i. Thus, the number of class i instances in each dataset D_j becomes p_i |D_j|. Similarly, the probability vector of "Distribution II" is P = (0.3, 0.5, 0.2), and that of "Distribution III" is P = (0.3, 0.65, 0.05).
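As a concrete illustration of how these probability vectors determine the per-class instance counts p_i |D_j|, consider the following minimal Python sketch; the dataset size of 1,000 is an assumed value for illustration, not taken from the experiments:

    # Per-class instance counts implied by the three class distributions.
    # The dataset size |D_j| = 1000 is an assumed value for illustration.
    distributions = {
        "Distribution I":   (0.35, 0.30, 0.30),
        "Distribution II":  (0.30, 0.50, 0.20),
        "Distribution III": (0.30, 0.65, 0.05),
    }

    dataset_size = 1000  # hypothetical |D_j|

    for name, probs in distributions.items():
        counts = [round(p_i * dataset_size) for p_i in probs]
        print(name, counts)
    # Distribution I   [350, 300, 300]
    # Distribution II  [300, 500, 200]
    # Distribution III [300, 650, 50]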
Constructing proper learning algorithms for rule sets from UCI
datasets. In the same way as the construction of an appropriate learning algorithm for the meningitis data mining result, we constructed appropriate learning algorithms for the datasets of rule sets from the eight UCI datasets. Table 6.5 shows an overview of the learning algorithm constructed for each dataset under the three different class distributions.
For these datasets, CAMLET constructed various learning algorithms based on 'random rule set generation', the ID3 decision tree, and the C4.5 decision tree. These learning algorithms therefore consist of combinations of methods that had never previously appeared together in a single learning algorithm. Most of them include 'Voting' from bagging or 'Weighted Voting' from boosting. These results indicate that CAMLET constructed selective meta-learning algorithms for the datasets under all three class distributions.
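To make the two final evaluation methods concrete, the sketch below contrasts plain 'Voting', as used in bagging, with the 'Weighted Voting' of boosting. This is illustrative Python, not CAMLET's actual implementation; the function names, committee labels, and weights are assumptions:

    from collections import defaultdict

    def voting(predictions):
        # Plain majority voting, as used in bagging: every classifier
        # in the committee contributes one unweighted vote.
        tally = defaultdict(float)
        for label in predictions:
            tally[label] += 1.0
        return max(tally, key=tally.get)

    def weighted_voting(predictions, weights):
        # Weighted voting, as used in boosting: each classifier's vote
        # is scaled by a weight, typically derived from its training error.
        tally = defaultdict(float)
        for label, weight in zip(predictions, weights):
            tally[label] += weight
        return max(tally, key=tally.get)

    # Hypothetical three-classifier committee (labels and weights assumed).
    print(voting(["A", "B", "A"]))                             # -> A
    print(weighted_voting(["A", "B", "B"], [0.9, 0.4, 0.3]))   # -> A

Under weighted voting, a single accurate classifier can overrule several weaker ones, which is why boosting derives these weights from each committee member's training error.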
Accuracy Comparison on Classification Performances. For the above-mentioned datasets, we used the five learning algorithms to estimate whether their classification results reached or exceeded the accuracy obtained by simply predicting the majority class. Table 6.6 shows the accuracies of the five learning algorithms applied to each class distribution of the datasets. The learning algorithms constructed by CAMLET, boosted J4.8, bagged J4.8, J4.8, and BPNN always performed better than simply predicting the majority class of each dataset. In particular, bagged J4.8 and boosted J4.8 outperformed J4.8 and BPNN on almost all datasets. However, their performance suffered under the probabilistic class distributions on larger datasets, such as balance-scale and credit-a.
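As a minimal sketch of this baseline comparison (the labels and the model accuracy below are hypothetical values, not the experimental data), the majority-class accuracy that each learner must reach can be computed as follows:

    from collections import Counter

    def majority_class_accuracy(labels):
        # Accuracy obtained by always predicting the most frequent class;
        # this is the baseline the learned models must reach or exceed.
        counts = Counter(labels)
        return counts.most_common(1)[0][1] / len(labels)

    # Hypothetical dataset following Distribution III, P = (0.3, 0.65, 0.05):
    labels = ["class1"] * 300 + ["class2"] * 650 + ["class3"] * 50
    baseline = majority_class_accuracy(labels)
    print(f"majority-class baseline: {baseline:.2f}")   # -> 0.65

    # A learner's estimated accuracy (e.g. from cross-validation) passes
    # the check when it reaches or exceeds this baseline.
    model_accuracy = 0.82  # assumed value for illustration
    print(model_accuracy >= baseline)                   # -> True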
 