Instance-based Family. In IB2, if an instance is misclassified by its nearest neighbour, it is added to the final case memory [3,2]. Like CNN, the resulting memory can be highly noisy. IB3 imposes a more restrictive condition for keeping a selected instance in the final case memory, reducing the presence of noise: if an instance in the memory has a low acceptability level, i.e. low accuracy, it is removed.
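The incremental rule of IB2 can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a 1-NN classifier with Euclidean distance and seeds the memory with the first instance; the function name is ours.

```python
import numpy as np

def ib2(X, y):
    """IB2 sketch: an instance is added to the memory only when the
    current memory misclassifies it (assumed 1-NN, Euclidean)."""
    memory_X = [X[0]]
    memory_y = [y[0]]
    for xi, yi in zip(X[1:], y[1:]):
        dists = [np.linalg.norm(xi - m) for m in memory_X]
        nearest_label = memory_y[int(np.argmin(dists))]
        if nearest_label != yi:   # misclassified -> keep it
            memory_X.append(xi)
            memory_y.append(yi)
    return np.array(memory_X), np.array(memory_y)
```

On two well-separated clusters this keeps roughly one representative per region, but, as noted above, noisy instances are also retained because they are misclassified by their neighbours.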
The Shrink method [11] extends CNN in the same way as RNN: Shrink executes CNN and then removes from the resulting case memory those instances that are misclassified by their neighbours.
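The CNN-then-Shrink pipeline just described can be sketched as below. This is an illustrative reading under the assumption of a 1-NN Euclidean classifier; `nn_predict`, `cnn` and `shrink` are our own names.

```python
import numpy as np

def nn_predict(mem_X, mem_y, xi):
    """1-NN prediction over the current memory (Euclidean distance)."""
    d = np.linalg.norm(mem_X - xi, axis=1)
    return mem_y[int(np.argmin(d))]

def cnn(X, y):
    """Condensed NN: keep only instances the growing memory misclassifies,
    sweeping until no further instance needs to be added."""
    keep = [0]
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in keep:
                continue
            if nn_predict(X[keep], y[keep], X[i]) != y[i]:
                keep.append(i)
                changed = True
    return keep

def shrink(X, y):
    """Shrink sketch: run CNN, then drop any retained case that the
    remaining retained cases misclassify."""
    keep = cnn(X, y)
    final = []
    for i in keep:
        others = [j for j in keep if j != i]
        if nn_predict(X[others], y[others], X[i]) == y[i]:
            final.append(i)
    return final
```

Note that the final pass tends to discard noisy cases retained by CNN, at the cost of sometimes also discarding genuine boundary cases.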
DROP Family. The reduction methods DROP1, DROP2 and DROP3 were introduced as methodologies that provide noise tolerance, high generalisation accuracy, insensitivity to the order of presentation of instances, and significant storage reduction [31]. All these methods rely on the concept of associate, i.e. the nearest neighbours to a case that belong to the same class. DROP1 is identical to RNN except that accuracy is checked on the selected instances compounding the final case memory, where initially all cases are selected by default. A case is deselected if at least as many of its previously selected associates are classified correctly without it. DROP2 differs in that a case is deselected if at least as many of its associates in the original case memory would be classified correctly without it. DROP3 simply executes ENN before DROP1.
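The deletion test of DROP1 can be illustrated as follows. This is a simplified sketch, not the reference implementation of [31]: it assumes a Euclidean k-NN majority-vote classifier, and it takes the operational reading in which the associates of a case p are the selected cases that count p among their k nearest neighbours.

```python
import numpy as np

def drop1(X, y, k=3):
    """DROP1 sketch: start with every case selected; remove a case p
    when at least as many of its associates are classified correctly
    without p as with p (assumed Euclidean k-NN, majority vote)."""
    n = len(X)
    selected = set(range(n))

    def knn(i, pool):
        """Indices of the k nearest cases to i among pool, excluding i."""
        cand = [j for j in sorted(pool) if j != i]
        d = np.linalg.norm(X[cand] - X[i], axis=1)
        return [cand[j] for j in np.argsort(d)[:k]]

    def classify(i, pool):
        """Majority-vote k-NN label of case i using the cases in pool."""
        votes = y[knn(i, pool)]
        vals, counts = np.unique(votes, return_counts=True)
        return vals[int(np.argmax(counts))]

    for p in range(n):
        if p not in selected:
            continue
        associates = [a for a in sorted(selected)
                      if a != p and p in knn(a, selected)]
        with_p = sum(classify(a, selected) == y[a] for a in associates)
        without_p = sum(classify(a, selected - {p}) == y[a] for a in associates)
        if without_p >= with_p:
            selected.discard(p)
    return sorted(selected)
```

DROP2 would compute the associates and the accuracy check over the original case memory rather than the shrinking selection, and DROP3 would apply an ENN noise filter first.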
An Evolutionary Multiobjective Optimization Approach for Case Selection. The selection of cases concerns finding the smallest subset of cases in a data base that obtains the most accurate classification possible. Described more formally, let us suppose an initial case memory M where |M| = X; the algorithm finds Mᵟ ⊂ M, removing the irrelevant or redundant cases while obtaining good classification accuracy. For the sake of clarity, since the algorithm could obtain different Mᵟ sets, we denote them by x, y, z, etc.
Therefore, as in [10] for attribute selection, the problem of case selection can be approached as a multiobjective optimization problem, whose solution comprises a set of solutions called non-dominated solutions (or Pareto solutions). Given two solutions x = {c | c ∈ M} and y = {c | c ∈ M}, solution x dominates solution y if [6]:
- Solution x is not worse than y for any of the objectives in mind;
- Solution x is strictly better than y for at least one of the objectives.
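The two conditions above transcribe directly into a dominance test. The sketch below assumes each solution is summarised by a vector of objective values that are all to be minimised (as with error ratio and cardinality later in this section):

```python
def dominates(fx, fy):
    """True if objective vector fx dominates fy (minimisation):
    fx is no worse in every objective and strictly better in at least one."""
    return (all(a <= b for a, b in zip(fx, fy))
            and any(a < b for a, b in zip(fx, fy)))
```

The Pareto set is then simply the set of candidate solutions that no other candidate dominates.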
For the case selection problem in mind, two optimization criteria have been considered: accuracy and compactness. To formulate these criteria, the following quantitative measures have been defined.
Given a solution x of M, we define:
- Accuracy. Based on the error ratio ER(x) = Φ(x)/|x|, where Φ(x) is the number of cases misclassified for a set of cases, x, by a given classification algorithm.
- Compactness. By the cardinality |x|, that is, the number of cases used to construct the model.
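The two measures can be sketched as below. Since the text leaves the classification algorithm open, the sketch assumes, purely for illustration, leave-one-out 1-NN over the solution itself; the function names are ours.

```python
import numpy as np

def error_ratio(X, y):
    """ER(x) = Phi(x) / |x|, where Phi(x) counts the cases of x
    misclassified by leave-one-out 1-NN (the assumed classifier)."""
    n = len(X)
    phi = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        d = np.linalg.norm(X[others] - X[i], axis=1)
        if y[others[int(np.argmin(d))]] != y[i]:
            phi += 1
    return phi / n

def cardinality(X):
    """Compactness criterion: |x|, the number of cases in the model."""
    return len(X)
```

A multiobjective search would then minimise the pair (ER(x), |x|) over candidate subsets x of M, keeping the non-dominated ones.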