5. Instance Selection Using Evolutionary
Algorithms: An Experimental Study
José Ramón Cano (1), Francisco Herrera (2), and Manuel Lozano (2)
(1) Dept. of Computer Science, Escuela Politecnica Superior de Linares,
University of Jaén, 23700 Jaén, Spain; email: jrcano@decsai.ugr.es
(2) Dept. of Computer Science and Artificial Intelligence, Escuela Tecnica
Superior de Ingenieria Informatica, University of Granada, 18071 Granada,
Spain; email: herrera, lozano@decsai.ugr.es
In this chapter, we carry out an empirical study of the performance of four
representative evolutionary algorithm models from two instance-selection
perspectives: prototype selection and training set selection for data
reduction in knowledge discovery. The study includes a comparison between
these algorithms and other nonevolutionary instance-selection algorithms. The
results show that the evolutionary instance-selection algorithms consistently
outperform the nonevolutionary ones, offering two main advantages
simultaneously: better instance-reduction rates and higher classification accuracy.
5.1 Introduction
Digital technologies and computer advances, together with booming Internet use,
have led to the massive collection of data and information. Research in areas of
science from astronomy to the human genome is facing the same problem: choking
on information. Raw data are rarely of direct use, and manual analysis simply
cannot keep pace with the fast growth of data. Knowledge discovery (KD) [34] and
data mining (DM) [1] help here; they aim to turn raw data into nuggets and create
special edges.
KD processes include problem comprehension, data comprehension, data
preprocessing, DM, evaluation, and development [1], [8], [35]. The first three
processes (problem and data comprehension and data preprocessing) play a pivotal
role in successful DM.
Due to the enormous amounts of data, much current research focuses on scaling
up DM algorithms. Other research has worked on scaling down the data. The
major issue in scaling down data is selecting the relevant data and then
presenting them to a DM algorithm [25]. This task is carried out in the
data-preprocessing phase of the KD process.
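
As a rough illustration of scaling down data, the sketch below keeps a subset of a training set, hands only that subset to a classifier, and reports the two quantities this chapter compares: the instance-reduction rate and the classification accuracy. The 1-nearest-neighbour classifier, the toy data, the hand-picked selection mask, and all function names are assumptions made for illustration; they stand in for a real selection algorithm and are not the method studied in the chapter.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def reduction_rate(n_original, n_selected):
    """Fraction of instances removed by the selection step."""
    return 1.0 - n_selected / n_original


def predict_1nn(reference, query):
    """Label of the reference instance closest to `query`."""
    nearest = min(reference, key=lambda item: dist(item[0], query))
    return nearest[1]


def accuracy_1nn(reference, test):
    """Share of test instances whose 1-NN label matches the true label."""
    hits = sum(predict_1nn(reference, x) == y for x, y in test)
    return hits / len(test)


# Toy data: two well-separated classes in the plane (illustrative only).
training = [
    ((0.0, 0.0), "a"), ((0.2, 0.1), "a"), ((0.1, 0.3), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.2), "b"), ((1.1, 0.8), "b"),
]
test = [((0.1, 0.1), "a"), ((1.0, 0.9), "b")]

# Hypothetical mask standing in for the output of some selection algorithm:
# True means the instance is kept.
mask = [True, False, False, True, False, False]
selected = [inst for inst, keep in zip(training, mask) if keep]

rate = reduction_rate(len(training), len(selected))
acc = accuracy_1nn(selected, test)
print(f"reduction rate: {rate:.2f}, 1-NN accuracy: {acc:.2f}")
```

In this toy case one instance per class is enough to classify the test points, so the subset is smaller without losing accuracy; on real data the two goals usually have to be traded off against each other.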
Data preprocessing comprises the following strategies: data reduction, data
cleaning, data construction, data integration, and data format change. Our
attention is focused on data reduction. Data reduction can be achieved in many
ways: