Chapter 8
SOFT COMPUTING FOR FEATURE SELECTION
A. K. JAGADEV, S. DEVI and R. MALL§
Department of Computer Science and Engineering,
ITER, SOA University, Bhubaneswar, India
a jagadev@yahoo.co.in
Department of Computer Science and Engineering,
Indian Institute of Technology, Kharagpur-721302, India
§ rajib@cse.iitkgp.ernet.in
Feature selection has long been a focus of research, and a large body of
work exists. It is in particular demand in applications with high-dimensional
datasets, where tens or hundreds of thousands of variables are available. This
survey gives a comprehensive overview of existing methods from the 1970s to the
present, considering both soft and non-soft computing paradigms. The strengths
and weaknesses of the different methods are explained, and the methods are
categorized according to their generation procedures and evaluation functions.
The objective of feature selection is threefold: improving the prediction
performance of the predictors, providing faster and more cost-effective
prediction, and providing a better understanding of the underlying process that
generated the data. This survey also identifies future research areas in feature
subset selection and introduces newcomers to the field.
8.1. Introduction
A universal problem that all intelligent agents must face is where to
focus their attention; for example, a problem-solving agent must decide
which aspects of a problem are relevant. The majority of real-world
classification problems require supervised learning, where each instance
is associated with a class label but the underlying class probabilities
and class-conditional probabilities are unknown, and the relevant features
are often unknown a priori. In many applications, the dataset is so large
that learning may not work well until the unwanted features are removed.
Theoretically, having more features implies more discriminative power in
classification. Many reduction algorithms have been developed over the
past years. Generally, they can be divided into two broad