13.3.1 Feature Filters
Filter methods, the earliest approaches to feature selection, use general
properties of the data, rather than the accuracy of the target learning
algorithm, to evaluate the merit of feature subsets. As a result, filter
methods are generally much faster and more practical than wrapper methods,
especially on data of high dimensionality.
13.3.1.1 FOCUS
The FOCUS algorithm was originally designed for attributes with Boolean
domains [Almuallim and Dietterich (1994)]. FOCUS exhaustively searches
the space of feature subsets until it finds a subset in which every
combination of feature values is associated with a single value of the
class. The selected subset is then passed to the ID3 algorithm, which
constructs a decision tree.
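The exhaustive search described above can be illustrated with a minimal
sketch. It assumes the subsets are tried in order of increasing size, so
the first consistent subset found is also a smallest one, and it omits
the ID3 step. The names is_consistent and focus are hypothetical, chosen
for this example only.

```python
from itertools import combinations


def is_consistent(rows, labels, subset):
    """Return True if no two rows agree on `subset` but differ in class."""
    seen = {}
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        # setdefault stores the first label seen for this value combination
        if seen.setdefault(key, label) != label:
            return False
    return True


def focus(rows, labels):
    """Return the first consistent feature subset, trying smallest first.

    Assumption of this sketch: subsets are enumerated by increasing size.
    Returns None if even the full feature set is inconsistent.
    """
    n_features = len(rows[0])
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if is_consistent(rows, labels, subset):
                return subset
    return None


# Toy Boolean data: class = x0 XOR x1, feature x2 is irrelevant.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1),
        (0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 0, 0)]
labels = [0, 1, 1, 0, 0, 0, 1, 1]
selected = focus(rows, labels)
```

On this toy data the search rejects the empty subset and all single
features before accepting the pair of features 0 and 1. Because the number
of subsets grows exponentially with the number of features, a search of
this kind is only feasible when few features are needed.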
13.3.1.2 LVF
The LVF algorithm [Liu and Setiono (1996)] is consistency-driven and can
handle noisy domains if the approximate noise level is known a priori.
In every round, LVF generates a random subset from the feature subset
space. If the chosen subset is smaller than the current best subset, the
inconsistency rate of the dimensionally reduced data described by the
subset is compared with the inconsistency rate of the best subset. If the
subset is at least as consistent as the best subset, it replaces the best
subset.
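The loop described above might be sketched as follows. This is an
illustration under stated assumptions, not the published procedure: the
inconsistency rate is taken to be the fraction of instances that do not
belong to the majority class of their group of identical feature values,
the number of rounds (max_tries) and the random seed are hypothetical
parameters, and the noise-level threshold is left out.

```python
import random
from collections import Counter, defaultdict


def inconsistency_rate(rows, labels, subset):
    """Fraction of rows outside the majority class of their value group."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row[i] for i in subset)][label] += 1
    clashes = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return clashes / len(rows)


def lvf(rows, labels, max_tries=1000, seed=0):
    """Random search for a smaller subset that is at least as consistent."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    best = tuple(range(n_features))  # start from the full feature set
    best_rate = inconsistency_rate(rows, labels, best)
    for _ in range(max_tries):
        size = rng.randint(1, n_features)
        candidate = tuple(sorted(rng.sample(range(n_features), size)))
        # Only strictly smaller subsets are compared with the current best.
        if len(candidate) < len(best):
            rate = inconsistency_rate(rows, labels, candidate)
            if rate <= best_rate:
                best, best_rate = candidate, rate
    return best, best_rate


# Toy data: class = x0 XOR x1, feature x2 is irrelevant.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1),
        (0, 0, 1), (1, 1, 0), (0, 1, 1), (1, 0, 0)]
labels = [0, 1, 1, 0, 0, 0, 1, 1]
best_subset, best_rate = lvf(rows, labels)
```

Because the subsets are drawn at random, the result depends on the number
of rounds; more rounds make it more likely that a small consistent subset
is sampled.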
13.3.1.3 Using a Learning Algorithm as a Filter
Some works have explored the possibility of using a learning algorithm as a
pre-processor to discover useful feature subsets for a primary learning
algorithm. Cardie (1995) describes the application of decision tree
algorithms for selecting feature subsets for use by instance-based
learners. In [Provan and Singh (1996)], a greedy oblivious decision tree
algorithm is used to select features to construct a Bayesian network.
Holmes and Nevill-Manning (1995) apply Holte's (1993) 1R system in order
to estimate the predictive accuracy of individual features. A program that
induces decision table majority classifiers in order to select features is
presented in [Pfahringer (1995)].
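As an illustration of scoring individual features with a one-rule
learner, the following sketch builds, for each feature, a rule that maps
every feature value to its most frequent class and then ranks the features
by the accuracy of that rule. It is a simplification made for this
example: it assumes nominal features, measures accuracy on the training
data, and does not reproduce the details of the 1R system. The names
one_r_accuracy and rank_features are hypothetical.

```python
from collections import Counter, defaultdict


def one_r_accuracy(rows, labels, feature):
    """Training accuracy of a rule mapping each value to its majority class."""
    counts = defaultdict(Counter)
    for row, label in zip(rows, labels):
        counts[row[feature]][label] += 1
    # For each feature value, the majority class is predicted correctly.
    correct = sum(max(c.values()) for c in counts.values())
    return correct / len(rows)


def rank_features(rows, labels):
    """Return (accuracy, feature index) pairs, most accurate first."""
    n_features = len(rows[0])
    scores = [(one_r_accuracy(rows, labels, f), f) for f in range(n_features)]
    return sorted(scores, key=lambda s: (-s[0], s[1]))


# Toy nominal data: feature 0 determines the class, feature 1 does not.
rows = [("sunny", "hot"), ("sunny", "mild"),
        ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
ranking = rank_features(rows, labels)
```

A ranking of this kind evaluates each feature in isolation, so it cannot
detect features that are predictive only in combination with others.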
Decision table majority (DTM) classifiers are restricted to returning
stored instances that are exact matches with the instance to be classified.
When no instances are returned, the most prevalent class in the training