Chapter 10
A Data Mining Software Package Including
Data Preparation and Reduction: KEEL
Abstract KEEL is an open source Data Mining tool widely used in research
and real-life applications. Most, if not all, of the algorithms described throughout
this book are implemented and publicly available in this Data Mining
platform. Since KEEL enables the user to create and run single or concatenated
preprocessing techniques on the data, the software is carefully introduced in this
section, guiding the reader through the steps needed to set up any data
preparation that might be required. It is also worth noting that the experimental
analyses carried out in this book were created using KEEL, allowing the
reader to quickly compare and adapt the results presented here. An extensive
review of Data Mining software tools is presented in Sect. 10.1. Among them,
we focus on the open source KEEL platform in Sect. 10.2, providing details of
its main features and usage. For the practitioner's interest, the most commonly used
data sources are introduced in Sect. 10.3, and the steps needed to integrate a new
algorithm into KEEL in Sect. 10.4. Once the results have been obtained, the appropriate
comparison guidelines are provided in Sect. 10.5. The most important aspects of the
tool are summarized in Sect. 10.6.
10.1 Data Mining Software and Toolboxes
As indicated in Chap. 1, Data Mining (DM) is the process of automatically
discovering high-level knowledge by obtaining information from real-world, large
and complex data sets [1], and it is the core step of a broader process called KDD. In
addition to the DM step, the KDD process includes the application of several preprocessing
methods aimed at facilitating the application of DM algorithms, and of postprocessing
methods for refining and improving the discovered knowledge. The evolution of the
available techniques and their wide adoption call for gathering all the steps involved
in the KDD process into as few pieces of software as possible, for the sake
of easier application and comparison of the results obtained, while still giving
non-expert practitioners access to KDD techniques.
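The three KDD stages named above (preprocessing, the DM step, postprocessing) can be pictured as functions chained in one program. The following is only a minimal sketch in plain Python, not KEEL code: the function names, the toy data, and the choice of mean imputation and a one-feature threshold rule are all illustrative assumptions.

```python
from statistics import mean


def impute_missing(rows):
    """Preprocessing: replace None in each column with that column's mean."""
    columns = list(zip(*rows))
    means = [mean(v for v in col if v is not None) for col in columns]
    return [[m if v is None else v for v, m in zip(row, means)] for row in rows]


def learn_threshold_rule(rows, labels, feature=0):
    """DM step: learn a one-feature rule, using the midpoint of the class means.

    Assumes two classes (0 and 1) and that class 1 has the larger mean.
    """
    pos = [r[feature] for r, y in zip(rows, labels) if y == 1]
    neg = [r[feature] for r, y in zip(rows, labels) if y == 0]
    return {"feature": feature, "threshold": (mean(pos) + mean(neg)) / 2}


def describe_rule(rule):
    """Postprocessing: render the learned model as a readable statement."""
    return (
        f"IF x[{rule['feature']}] > {rule['threshold']:.2f} "
        "THEN class 1 ELSE class 0"
    )


def run_kdd(rows, labels):
    """Chain the three stages: preprocess, mine, postprocess."""
    clean = impute_missing(rows)
    rule = learn_threshold_rule(clean, labels)
    return clean, rule, describe_rule(rule)


# Hypothetical toy data: two numeric attributes, some values missing.
rows = [[1.0, 5.0], [None, 6.0], [8.0, None], [9.0, 7.0]]
labels = [0, 0, 1, 1]
clean, rule, text = run_kdd(rows, labels)
print(text)
```

Keeping every stage behind one entry point (here `run_kdd`) is the point of the paragraph above: once the steps live in a single piece of software, swapping a preprocessing method and comparing results means changing one function call.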
 