Evaluating Case Selection Algorithms for Analogical Reasoning Systems - Foundations on Natural and Artificial Computation - page 346

1. Calculate $\{M_i, i \in \{1, \cdots, f\}\}$, a random partition of $M$, and $\bar{M}_i$, the complement of each element $M_i$.

2. For each $\bar{M}_i$, apply the control and the case selection method to obtain $\bar{M}_i^{\sigma}$.

3. Validate the classifier using cross-validation, where $\bar{M}_i^{\sigma}$ is the training set and $M_i$ the test set.

4. Calculate the decision scores: reduction of the case memory, efficiency of the method, and quality of the solution.
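
The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the case layout, the function names (`partition`, `nearest_label`, `evaluate`), the use of wall-clock time as the efficiency score, and the use of accuracy as the quality score are all hypothetical choices, and a plain 1-NN classifier stands in for the classifier.

```python
import random
import time
from typing import Callable, List, Sequence, Tuple

# Assumed case layout: (numeric attributes, solution label).
Case = Tuple[Tuple[float, ...], str]


def partition(memory: Sequence[Case], f: int, seed: int = 0) -> List[List[Case]]:
    """Step 1: split the case memory M at random into f disjoint folds M_i."""
    shuffled = list(memory)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::f] for i in range(f)]


def nearest_label(training: Sequence[Case], query: Tuple[float, ...]) -> str:
    """Placeholder 1-NN classifier using squared Euclidean distance."""
    def dist(case: Case) -> float:
        return sum((a - b) ** 2 for a, b in zip(case[0], query))
    return min(training, key=dist)[1]


def evaluate(
    memory: Sequence[Case],
    f: int,
    select: Callable[[List[Case]], List[Case]],
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Steps 2-4: mean (reduction, selection time in seconds, accuracy)."""
    folds = partition(memory, f, seed)
    reductions, times, accuracies = [], [], []
    for i, test in enumerate(folds):
        # Complement of M_i: every case that is not in the test fold.
        complement = [c for j, fold in enumerate(folds) if j != i for c in fold]
        # Step 2: apply the control or case selection method.
        start = time.perf_counter()
        reduced = select(complement)
        times.append(time.perf_counter() - start)
        reductions.append(1.0 - len(reduced) / len(complement))
        # Step 3: the reduced complement trains, M_i tests.
        hits = sum(nearest_label(reduced, attrs) == label for attrs, label in test)
        accuracies.append(hits / len(test))
    # Step 4: average the decision scores over the f folds.
    return (sum(reductions) / f, sum(times) / f, sum(accuracies) / f)
```

Passing `select=list` keeps every case, so the reduction score is zero; any function that maps a list of cases to a smaller list can be plugged in the same way. The sketch assumes `f` does not exceed the number of cases, so that no fold is empty.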

In the first step, the partition is made to identify the test and training sets. In the second step, the case selection method is applied in order to reduce the case memory. Note that the chosen case selection method should not have an adverse effect on the ARS. To obtain an initial filtering of the case selection methods, we can compare them with control tests. In our case, the control methods are random selection (removing 25, 50, or 75% of the cases from the case memory) and no selection (keeping the original case memory). Therefore, this methodology considers acceptable only those case selection methods whose results match or improve on those of the control methods.
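
The control methods and the acceptance rule can be sketched as below. The function names are hypothetical, and the acceptance test rests on two assumptions that the text does not spell out: that a higher score is better, and that a candidate must match or beat every control.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def none_selection(cases: Sequence[T]) -> List[T]:
    """Control: keep the original case memory unchanged."""
    return list(cases)


def random_selection(cases: Sequence[T], removed_fraction: float, seed: int = 0) -> List[T]:
    """Control: remove a given fraction of the cases (e.g. 0.25, 0.50, 0.75) at random."""
    keep = round(len(cases) * (1.0 - removed_fraction))
    return random.Random(seed).sample(list(cases), keep)


def is_acceptable(candidate_score: float, control_scores: Sequence[float]) -> bool:
    """Assumed rule: the candidate must match or improve on every control score."""
    return all(candidate_score >= score for score in control_scores)


# Example: the three random controls applied to a memory of 100 cases.
memory = list(range(100))
controls = {frac: random_selection(memory, frac) for frac in (0.25, 0.50, 0.75)}
```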

Fig. 1. Evaluation methodology for case selection algorithms

The third step is a classical cross-validation process. Because case selection methods are used to improve ARSs, it seems reasonable to include the ARS itself at this step. However, this kind of system (such as a CBR system) can involve computationally expensive processes (e.g. similarity or adaptation functions), and the validation step requires a high number of iterations. Therefore, the custom cross-validation presented (number of folds $f$) executes a case selection method using the training set $\bar{M}_i^{\sigma}$, the test set $M_i$, and K-NN as the classifier (iterating over $i = 1, \cdots, f$). The K-NN distance has two components, local and global, where the global distance depends on the local one. The local distance is the distance between case attribute values, so its calculation depends on the attribute type. In our evaluation there are just two types, numeric and string of characters, and we call $d_{num}$ the distance between numeric values and $d_{string}$ the distance between string values:
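
The definitions of $d_{num}$ and $d_{string}$ are not reproduced on this page. Purely to illustrate how a global distance can be built from type-dependent local distances, the sketch below assumes a range-scaled absolute difference for numeric values, a 0/1 mismatch for strings, and a Euclidean combination of the local distances. These three choices are assumptions made for illustration and may differ from the authors' definitions.

```python
from collections import Counter
from math import sqrt
from typing import Sequence, Tuple, Union

Value = Union[float, str]
Case = Tuple[Tuple[Value, ...], str]  # assumed layout: (attributes, label)


def d_num(a: float, b: float, value_range: float = 1.0) -> float:
    """Assumed numeric local distance: scaled absolute difference, capped at 1."""
    return min(abs(a - b) / value_range, 1.0)


def d_string(a: str, b: str) -> float:
    """Assumed string local distance: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0


def local_distance(a: Value, b: Value) -> float:
    """Dispatch on the attribute type, as the local distance depends on it."""
    if isinstance(a, str) or isinstance(b, str):
        return d_string(str(a), str(b))
    return d_num(float(a), float(b))


def global_distance(x: Sequence[Value], y: Sequence[Value]) -> float:
    """Assumed global distance: Euclidean combination of the local distances."""
    return sqrt(sum(local_distance(a, b) ** 2 for a, b in zip(x, y)))


def knn_classify(training: Sequence[Case], query: Sequence[Value], k: int = 3) -> str:
    """Majority vote among the k training cases closest to the query."""
    nearest = sorted(training, key=lambda case: global_distance(case[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```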
