Instance-based Family. In IB2, if an instance is misclassified by its nearest neighbour, it is added to the final case memory [3,2]. Like CNN, the resulting memory can be highly noisy. IB3 imposes a more restrictive condition for keeping a selected instance in the final case memory, reducing the presence of noise: if an instance in the memory has a low acceptability level, i.e. low accuracy, it is removed.
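The incremental rule of IB2 can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a 1-NN classifier with Euclidean distance and seeds the memory with the first instance; the function name is ours.

```python
import numpy as np

def ib2(X, y):
    """IB2 sketch: an instance is added to the memory only when the
    current memory misclassifies it (assumed 1-NN, Euclidean)."""
    memory_X = [X[0]]
    memory_y = [y[0]]
    for xi, yi in zip(X[1:], y[1:]):
        dists = [np.linalg.norm(xi - m) for m in memory_X]
        nearest_label = memory_y[int(np.argmin(dists))]
        if nearest_label != yi:   # misclassified -> keep it
            memory_X.append(xi)
            memory_y.append(yi)
    return np.array(memory_X), np.array(memory_y)
```

On two well-separated clusters this keeps roughly one representative per region, but, as noted above, noisy instances are also retained because they are misclassified by their neighbours.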
The Shrink method [11] extends CNN in the same way as RNN: Shrink executes CNN and then removes from the resulting case memory those instances that are misclassified by their neighbours.
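The CNN-then-Shrink pipeline just described can be sketched as below. This is an illustrative reading under the assumption of a 1-NN Euclidean classifier; `nn_predict`, `cnn` and `shrink` are our own names.

```python
import numpy as np

def nn_predict(mem_X, mem_y, xi):
    """1-NN prediction over the current memory (Euclidean distance)."""
    d = np.linalg.norm(mem_X - xi, axis=1)
    return mem_y[int(np.argmin(d))]

def cnn(X, y):
    """Condensed NN: keep only instances the growing memory misclassifies,
    sweeping until no further instance needs to be added."""
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in keep:
                continue
            if nn_predict(X[keep], y[keep], X[i]) != y[i]:
                keep.append(i)
                changed = True
    return keep

def shrink(X, y):
    """Shrink sketch: run CNN, then drop any retained case that the
    remaining retained cases misclassify."""
    keep = cnn(X, y)
    final = []
    for i in keep:
        others = [j for j in keep if j != i]
        if nn_predict(X[others], y[others], X[i]) == y[i]:
            final.append(i)
    return final
```

Note that the final pass tends to discard noisy cases retained by CNN, at the cost of sometimes also discarding genuine boundary cases.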
DROP Family. The reduction methods DROP1, DROP2 and DROP3 were introduced as methodologies that provide noise tolerance, high generalisation accuracy, insensitivity to the order of presentation of instances, and significant storage reduction [31]. All these methods rely on the concept of associate, i.e. the nearest neighbours to a case that belong to the same class. DROP1 is identical to RNN except that accuracy is checked on the selected instances compounding the final case memory, where initially all cases are selected by default. A case is deselected if at least as many of its previously selected associates are classified correctly without it. DROP2 differs in that a case is deselected if at least as many of its associates in the original case memory would be classified correctly without it. DROP3 simply executes ENN before DROP1.
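The deletion test of DROP1 can be illustrated as follows. This is a simplified sketch, not the reference implementation of [31]: it assumes a Euclidean k-NN majority-vote classifier, and it takes the operational reading in which the associates of a case p are the selected cases that count p among their k nearest neighbours.

```python
import numpy as np

def drop1(X, y, k=3):
    """DROP1 sketch: start with every case selected; remove a case p
    when at least as many of its associates are classified correctly
    without p as with p (assumed Euclidean k-NN, majority vote)."""
    n = len(X)
    selected = set(range(n))

    def knn(i, pool):
        """Indices of the k nearest cases to i among pool, excluding i."""
        cand = [j for j in sorted(pool) if j != i]
        d = np.linalg.norm(X[cand] - X[i], axis=1)
        return [cand[j] for j in np.argsort(d)[:k]]

    def classify(i, pool):
        """Majority-vote k-NN label of case i using the cases in pool."""
        votes = y[knn(i, pool)]
        vals, counts = np.unique(votes, return_counts=True)
        return vals[int(np.argmax(counts))]

    for p in range(n):
        if p not in selected:
            continue
        associates = [a for a in sorted(selected)
                      if a != p and p in knn(a, selected)]
        with_p = sum(classify(a, selected) == y[a] for a in associates)
        without_p = sum(classify(a, selected - {p}) == y[a] for a in associates)
        if without_p >= with_p:
            selected.discard(p)
    return sorted(selected)
```

DROP2 would compute the associates and the accuracy check over the original case memory rather than the shrinking selection, and DROP3 would apply an ENN noise filter first.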
An Evolutionary Multiobjective Optimization Approach for Case Selection. The selection of cases concerns finding the smallest subset of cases in a data base that obtains the most accurate classification possible. Described more formally, let us suppose an initial case memory M where |M| = X; the algorithm finds Mᵟ ⊂ M, removing the irrelevant or redundant cases while obtaining good classification accuracy. For the sake of clarity, since the algorithm could obtain different Mᵟ sets, we denote them by x, y, z, etc.
Therefore, as in [10] for attribute selection, the problem of case selection can be approached as a multiobjective optimization problem, whose solution comprises a set of solutions called non-dominated solutions (or Pareto solutions). Given two solutions x = {c | c ∈ M} and y = {c | c ∈ M}, solution x dominates solution y if [6]:
- Solution x is not worse than y for any of the objectives in mind;
- Solution x is strictly better than y for at least one of the objectives.
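The two conditions above transcribe directly into a dominance test. The sketch below assumes each solution is summarised by a vector of objective values that are all to be minimised (as with error ratio and cardinality later in this section):

```python
def dominates(fx, fy):
    """True if objective vector fx dominates fy (minimisation):
    fx is no worse in every objective and strictly better in at least one."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))
```

The Pareto set is then simply the set of candidate solutions that no other candidate dominates.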
For the case selection problem in mind, two optimization criteria have been considered: accuracy and compactness. To formulate these criteria, the following quantitative measures have been defined.
Given a solution x of M, we define:
- Accuracy. Based on the error ratio ER(x) = Φ(x)/|x|, where Φ(x) is the number of cases misclassified for a set of cases, x, by a given classification algorithm.
- Compactness. By the cardinality |x|, that is, the number of cases used to construct the model.
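The two measures can be sketched as below. Since the text leaves the classification algorithm open, the sketch assumes, purely for illustration, leave-one-out 1-NN over the solution itself; the function names are ours.

```python
import numpy as np

def error_ratio(X, y):
    """ER(x) = Phi(x) / |x|, where Phi(x) counts the cases of x
    misclassified by leave-one-out 1-NN (the assumed classifier)."""
    n = len(X)
    phi = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        d = np.linalg.norm(X[others] - X[i], axis=1)
        if y[others[int(np.argmin(d))]] != y[i]:
            phi += 1
    return phi / n

def cardinality(X):
    """Compactness criterion: |x|, the number of cases in the model."""
    return len(X)
```

A multiobjective search would then minimise the pair (ER(x), |x|) over candidate subsets x of M, keeping the non-dominated ones.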