be a better classifier for the entire dataset than these hyperplane learners and bagged J4.8, they needed more training instances to become accurate classifiers. Looking at the results of the learning algorithm constructed by CAMLET, this algorithm achieves almost the same performance as bagged J4.8 with smaller training subsets, and it can outperform bagged J4.8 with larger training subsets. Although the constructed algorithm was based on boosting, the combination of a reinforcement method from Classifier Systems and the outer loop was able to overcome the disadvantage that boosting suffers on smaller training subsets.
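The constructed algorithm itself combines boosting with a reinforcement method from Classifier Systems and is not reproduced here. As a generic illustration of the comparison discussed above, the following sketch contrasts a bagged and a boosted tree ensemble as the training subset grows; it assumes scikit-learn's BaggingClassifier and AdaBoostClassifier, a DecisionTreeClassifier as a stand-in for J4.8, and synthetic data in place of the chapter's dataset.

```python
# Hedged sketch: learning-curve comparison of bagging vs. boosting over
# increasing training subset sizes. All data and estimators are stand-ins,
# not the chapter's actual algorithms or dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the rule evaluation dataset.
X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for frac in (0.1, 0.25, 0.5, 1.0):                 # increasing training subsets
    n = int(len(X_tr) * frac)
    bagged = BaggingClassifier(DecisionTreeClassifier(),
                               n_estimators=10, random_state=0)
    boosted = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                                 n_estimators=10, random_state=0)
    bagged.fit(X_tr[:n], y_tr[:n])
    boosted.fit(X_tr[:n], y_tr[:n])
    print(f"{frac:>4.0%} of training set: "
          f"bagging={accuracy_score(y_te, bagged.predict(X_te)):.3f}  "
          f"boosting={accuracy_score(y_te, boosted.predict(X_te)):.3f}")
```

With small subsets the boosted ensemble typically lags, which is the disadvantage the constructed algorithm's outer loop is reported to compensate for.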
Rule evaluation models for the meningitis data mining result dataset.
In this section, we present the rule evaluation models learned for the entire dataset with CAMLET, OneR, J4.8 and CLR, because these algorithms represent what they learn as explicit models such as a rule set, a decision tree, and a linear model set.
As shown in Fig. 6.5, the indices used in the learned rule evaluation models are taken not only from the group of indices that increase with the correctness of a rule, but also from other groups of indices. The former group includes indices such as YLI1, Laplace Correction, Accuracy, Precision, Recall, Coverage, PSI and Gini Gain; the latter includes GBI and Peculiarity, which sums up the differences in antecedents between one rule and the other rules in the same rule set. This corresponds to a comment made by the human expert, who said that he evaluated these rules not only according to their correctness but also according to their interestingness, based on his expertise.
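To make the Peculiarity index concrete, the sketch below sums, for each rule, a distance between its antecedent and the antecedents of the other rules in the same rule set. The use of the symmetric-difference size as that distance, and the rule representation as sets of attribute-value conditions, are assumptions for illustration; the chapter's exact definition may differ.

```python
# Hedged sketch of a Peculiarity-style index: for each rule, sum the number of
# antecedent conditions not shared with each other rule in the rule set.
from typing import Dict, FrozenSet

RuleSet = Dict[str, FrozenSet[str]]  # rule name -> antecedent conditions

def peculiarity(rules: RuleSet) -> Dict[str, int]:
    scores = {}
    for name, antecedent in rules.items():
        scores[name] = sum(
            len(antecedent ^ other)              # symmetric difference of conditions
            for other_name, other in rules.items()
            if other_name != name
        )
    return scores

# Hypothetical rules for illustration only.
rules = {
    "r1": frozenset({"fever=high", "headache=yes"}),
    "r2": frozenset({"fever=high", "stiff_neck=yes"}),
    "r3": frozenset({"age>60"}),
}
print(peculiarity(rules))   # r3 shares no conditions, so it scores highest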
Fig. 6.5. Top 10 frequencies of the indices used by the models of each learning algorithm with 10,000 bootstrap samples of the meningitis data mining result dataset and executions. (Four bar-chart panels show the top 10 index frequencies in the OneR, CAMLET, CLR and J4.8 models; Peculiarity appears at the top of every panel.)
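The frequencies in Fig. 6.5 can be thought of as follows: learn a model on each bootstrap sample of the rule evaluation dataset (features are the objective rule indices, the label is the expert's evaluation) and count how often each index appears in the learned model. The sketch below illustrates this with a scikit-learn decision tree standing in for J4.8; the index names, synthetic data, label coding, and 1,000 iterations (versus the chapter's 10,000) are all assumptions for illustration.

```python
# Hedged sketch: count how often each index is used in models learned on
# bootstrap samples. Data, labels and the estimator are stand-ins.
from collections import Counter
import numpy as np
from sklearn.tree import DecisionTreeClassifier

index_names = ["Peculiarity", "GBI", "LaplaceCorrection", "Precision", "Coverage"]
rng = np.random.default_rng(0)
X = rng.random((200, len(index_names)))      # synthetic index values per rule
y = rng.integers(0, 3, size=200)             # hypothetical 3-class expert labels

usage = Counter()
for _ in range(1000):                        # the chapter reports 10,000 samples
    sample = rng.integers(0, len(X), size=len(X))          # bootstrap resample
    tree = DecisionTreeClassifier(max_depth=3).fit(X[sample], y[sample])
    used = set(tree.tree_.feature[tree.tree_.feature >= 0])  # split features
    usage.update(index_names[i] for i in used)

print(usage.most_common(10))                 # top-10 index frequencies
```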