\[
  d_{\mathrm{num}}(x_i, y_i) = \frac{|x_i - y_i|}{|\max(x_i) - \max(y_i)|}
  \tag{4}
\]

\[
  d_{\mathrm{string}}(x_i, y_i) =
  \begin{cases}
    1 & \text{if } x_i \neq y_i \\
    0 & \text{if } x_i = y_i
  \end{cases}
  \tag{5}
\]

where max(x_i) and max(y_i) are the maximum domain values for the attributes x_i and y_i.
Given two particular cases x = (x_1, ..., x_n) and y = (y_1, ..., y_n) with the same
number of attributes, the global Euclidean distance is defined as:

\[
  D(x, y) = \sqrt{\sum_{i=1}^{n} d_i(x_i, y_i)^2}
  \tag{6}
\]

where d_i is d_num or d_string, depending on the nature of the i-th attribute.
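As a rough illustration, the sketch below implements expressions (4)-(6) under the assumption that a case is a plain list of attribute values and that the per-attribute maximum domain values are supplied separately; the function and parameter names are our own and not part of the original methodology.

```python
import math

def d_num(xi, yi, max_xi, max_yi):
    # Normalised numeric distance, expression (4): the denominator uses the
    # maximum domain values of the two attributes, as stated in the text.
    return abs(xi - yi) / abs(max_xi - max_yi)

def d_string(xi, yi):
    # Symbolic (string) distance, expression (5): 1 if the values differ.
    return 0.0 if xi == yi else 1.0

def global_distance(x, y, max_x, max_y):
    """Global Euclidean distance of expression (6).

    x, y         -- two cases with the same number of attributes
    max_x, max_y -- maximum domain values per attribute (only consulted for
                    numeric attributes); hypothetical helper inputs
    """
    total = 0.0
    for i, (xi, yi) in enumerate(zip(x, y)):
        if isinstance(xi, (int, float)):
            di = d_num(xi, yi, max_x[i], max_y[i])
        else:
            di = d_string(xi, yi)
        total += di ** 2
    return math.sqrt(total)
```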
In the final step of the proposed methodology, the reduction of the case memory is
evaluated by the reduction rate function ρ (expression 2). The efficiency is calculated
as the average execution time of the reduction process. Finally, we consider
the generalised error rate to evaluate the classification, the κ coefficient to evaluate the
coincidence of the solutions, and the specificity and sensitivity values to analyse particular
problems of the application domain.
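A small sketch of how these evaluation measures could be computed follows. Since expression (2) is not reproduced in this excerpt, the reduction rate is assumed here to be the fraction of cases removed from the original memory, and the confusion-matrix counts are hypothetical inputs.

```python
def reduction_rate(original_size, reduced_size):
    # Assumed form of rho: fraction of cases removed from the original memory.
    return (original_size - reduced_size) / original_size

def error_rate(true_labels, predicted_labels):
    # Generalised error rate: fraction of misclassified test cases.
    errors = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return errors / len(true_labels)

def kappa(true_labels, predicted_labels):
    # Cohen's kappa: agreement between true and predicted labels beyond chance.
    n = len(true_labels)
    labels = set(true_labels) | set(predicted_labels)
    observed = sum(t == p for t, p in zip(true_labels, predicted_labels)) / n
    expected = sum(
        (true_labels.count(l) / n) * (predicted_labels.count(l) / n)
        for l in labels
    )
    return (observed - expected) / (1 - expected)

def sensitivity_specificity(tp, tn, fp, fn):
    # Sensitivity and specificity from binary confusion-matrix counts.
    return tp / (tp + fn), tn / (tn + fp)
```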
3.3 Case Selection Methods
In this work, we analyse three main families of case selection algorithms: NN algorithms,
instance-based algorithms, and DROP algorithms. Since the evaluation criteria for
case selection methods are their reduction capacity and the accuracy of the resulting
classifier, case selection can be considered a multiobjective problem. Therefore, we propose the use of
evolutionary algorithms to implement case selection.
NN Family. One of the first attempts to reduce case memory size was CNN [9]. Starting
with one random case of each class as the final case memory, the method takes the remaining
cases as a test set and classifies them with a K-NN classifier built from the selected cases.
If a case is misclassified, it is added to the final case memory. The process stops when all the
original cases are correctly classified. Although CNN achieves reduction, it
does not check for noisy cases; for this reason, RNN [7] was introduced as an extension
of CNN that removes noisy instances from the case memory obtained after applying CNN.
Each case of the final case memory is tentatively removed; if no case from the original
memory becomes misclassified, the candidate is definitively removed, otherwise it is kept.
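A minimal sketch of the CNN selection loop described above is given here, assuming cases are stored as (attributes, label) pairs and that a distance function such as the one in expression (6) is available; the helper names are hypothetical and a 1-NN classifier is used for brevity.

```python
import random
from collections import defaultdict

def cnn_select(cases, distance):
    """Condensed Nearest Neighbour (CNN) case selection sketch.

    cases    -- list of (attributes, label) pairs
    distance -- distance between two attribute vectors, e.g. expression (6)
    """
    # Start with one random case of each class as the selected memory.
    by_class = defaultdict(list)
    for case in cases:
        by_class[case[1]].append(case)
    selected = [random.choice(group) for group in by_class.values()]

    def nn_label(attrs):
        # Classify with the currently selected cases (1-NN for brevity).
        nearest = min(selected, key=lambda c: distance(attrs, c[0]))
        return nearest[1]

    # Add every misclassified case until all original cases are
    # correctly classified by the selected memory.
    changed = True
    while changed:
        changed = False
        for attrs, label in cases:
            if nn_label(attrs) != label:
                selected.append((attrs, label))
                changed = True
    return selected
```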
Unlike RNN, ENN does not start from the output memory of CNN. ENN removes
cases that are misclassified by their three nearest neighbours [29]. When ENN is
executed multiple times, taking each output as the input of the next execution, the
method is called RENN. All-KNN consists of executing ENN k times, where each
execution uses from 1 to k neighbours respectively to flag a case for removal [27].
Some authors claim that ENN and its variants are actually noise removal techniques [31].
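As an illustration, a sketch of ENN under the same assumed case representation follows: a case is kept only when the majority label of its three nearest neighbours agrees with its own label. RENN would simply repeat this editing step, feeding each output back in, until the memory stops changing.

```python
from collections import Counter

def enn_select(cases, distance, k=3):
    """Edited Nearest Neighbour (ENN) sketch: remove cases misclassified
    by their k nearest neighbours (k = 3 in the text)."""
    kept = []
    for i, (attrs, label) in enumerate(cases):
        # Nearest neighbours among all other cases.
        others = [c for j, c in enumerate(cases) if j != i]
        others.sort(key=lambda c: distance(attrs, c[0]))
        neighbour_labels = [lbl for _, lbl in others[:k]]
        majority = Counter(neighbour_labels).most_common(1)[0][0]
        if majority == label:
            kept.append((attrs, label))
    return kept
```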