given only the values for the selected features, is as close as possible to
the original class distribution given all feature values. 18
Definition 1:
Feature selection in supervised learning: feature selection
in supervised learning is the process of choosing a subset of the original
features that optimizes the predictive performance of a considered model
by eliminating redundant features and those with little or no predictive
information.
Definition 2:
Feature selection in unsupervised learning: feature
selection in unsupervised learning is the process of choosing a subset of
the original variables that forms a high-quality clustering for the given
number of clusters.
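Definition 1 can be made concrete with a small filter-style sketch: score each feature against the label and keep the top k. The function name and the choice of Pearson correlation as the relevance score are illustrative assumptions, not taken from the text.

```python
import math


def select_top_k(X, y, k):
    """Rank features by |Pearson correlation| with the label and keep
    the k highest-scoring ones (a hypothetical helper for illustration).

    X is a list of rows (numeric features), y the class labels.
    """
    m, d = len(X), len(X[0])

    def corr(col):
        # Pearson correlation between one feature column and the labels.
        mx = sum(col) / m
        my = sum(y) / m
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
        vx = math.sqrt(sum((a - mx) ** 2 for a in col))
        vy = math.sqrt(sum((b - my) ** 2 for b in y))
        return cov / (vx * vy) if vx and vy else 0.0

    scores = [abs(corr([row[f] for row in X])) for f in range(d)]
    # Indices of the k best-scoring features.
    return sorted(range(d), key=lambda f: scores[f], reverse=True)[:k]


# Feature 0 tracks the label; feature 1 is constant (no information).
X = [[0, 5], [1, 5], [2, 5], [3, 5]]
y = [0, 0, 1, 1]
print(select_top_k(X, y, 1))  # keeps feature 0
```

Any relevance score (mutual information, chi-square, etc.) could replace the correlation here; the ranking-and-truncation pattern is what defines the filter style.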
Following these definitions, the approaches for selecting relevant
features are categorized below.
(1) Embedded Approach: This method embeds the selection within the basic
induction algorithm itself, usually by means of weighting schemes.19,20
(2) Filter Approach: This method filters out irrelevant attributes before
induction occurs. FOCUS and RELIEF follow feature selection with
decision tree construction. In RELIEF, features are assigned weights;
since redundant features receive the same weight, the method may select
a duplicate feature, which increases complexity,21 whereas FOCUS
implements an exhaustive search. PCA (Principal Component Analysis)22
can also reduce dimensionality. The approach is illustrated in Fig. 8.2.
Fig. 8.2.
Filter approach in attribute selection.
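The RELIEF weighting idea mentioned above can be sketched as follows. This is a simplified, assumed variant (one nearest hit and one nearest miss per instance, numeric min-max-normalised features), not the authors' exact algorithm; note how two perfectly redundant features would receive identical weights, which is the duplicate-selection issue the text points out.

```python
import math


def relief_weights(X, y, n_iter=None):
    """Estimate RELIEF-style feature weights (simplified sketch).

    For each instance, find its nearest hit (same class) and nearest
    miss (different class). A feature's weight grows when it differs
    on the miss and agrees on the hit, i.e. when it separates classes.
    """
    m, d = len(X), len(X[0])
    n_iter = n_iter or m
    w = [0.0] * d

    def dist(a, b):
        # Euclidean distance between two instances.
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

    for it in range(n_iter):
        i = it % m
        xi, yi = X[i], y[i]
        # Nearest same-class neighbour (excluding the instance itself).
        hit = min((X[j] for j in range(m) if j != i and y[j] == yi),
                  key=lambda z: dist(xi, z))
        # Nearest different-class neighbour.
        miss = min((X[j] for j in range(m) if y[j] != yi),
                   key=lambda z: dist(xi, z))
        for f in range(d):
            w[f] += (abs(xi[f] - miss[f]) - abs(xi[f] - hit[f])) / n_iter
    return w


# Feature 0 separates the classes; feature 1 is constant noise.
X = [[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]]
y = [0, 0, 1, 1]
print(relief_weights(X, y))  # feature 0 gets a large weight, feature 1 near zero
```

The exhaustive search of FOCUS, by contrast, enumerates feature subsets until one is found that is consistent with the class labels, which is tractable only for small feature counts.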