Finally, we should also consider two important practical factors in FS:
Speed of the FS method: This concerns the computational complexity of the FS process itself. When dealing with large data sets, some FS techniques, especially exhaustive ones, are impractical to run. Wrapper methods are usually slower than filters, so under time constraints it may be crucial to determine the best FS choice.
Generality of the selected features: This concerns the estimation of a subset of features that is as general as possible, so it can be used with any DM algorithm. Such a subset is closer to the data itself and allows us to detect the most relevant or redundant features of an application. Filters are thought to be more appropriate for this than wrapper-based feature selectors.
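The generality of filters comes from scoring features against the data alone, with no learning algorithm in the loop. As a minimal sketch (the dataset, scoring criterion, and function names below are illustrative assumptions, not from the text), a filter can rank features by the absolute Pearson correlation of each feature with the class label:

```python
# Filter-style ranking: score each feature by |Pearson correlation| with
# the label, independently of any learning algorithm. Toy data below is
# an assumption for illustration.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def filter_rank(X, y):
    """Rank feature indices by |correlation| with the label (descending)."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])

# toy data: feature 0 tracks the label, feature 1 is mostly noise
X = [[1, 5], [2, 3], [3, 9], [4, 1], [5, 7]]
y = [1, 2, 3, 4, 5]
print(filter_rank(X, y))  # feature 0 ranked first: [0, 1]
```

Because the ranking depends only on the data, the resulting subset can be handed to any DM algorithm afterwards, which is exactly the generality argument above.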
7.3.3 Drawbacks
FS methods, regardless of their popularity, have several limitations:
The resulting subsets of many FS models (especially those obtained by wrapper-based approaches) are strongly dependent on the training set size. In other words, if the training data set is small, the subset of features returned will also be small, with a consequent loss of important features.
It is not always true that a large dimensionality input can be reduced to a small subset of features, because the objective feature (the class in classification) may actually be related to many input features, and the removal of any of them will seriously affect the learning performance.
A backward removal strategy is very slow when working with large-scale data sets, because in the first stages the algorithm has to make decisions based on huge quantities of data.
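The cost of backward removal can be seen in a short sketch. Starting from the full feature set, each iteration evaluates every candidate set obtained by dropping one feature, so the earliest, most expensive iterations work with nearly all features. The `score` function below is an assumed toy stand-in; a real wrapper would train and evaluate a model on each candidate subset:

```python
# Sketch of backward elimination: start from the full feature set and
# repeatedly drop the feature whose removal hurts the score least.
# `score` is an assumed toy evaluation favouring features {0, 2}; a real
# wrapper would train a model per candidate subset, which is what makes
# the early, near-full-set iterations so expensive on large data.

def backward_elimination(features, score, min_size=1):
    selected = list(features)
    while len(selected) > min_size:
        best_subset, best_score = None, float("-inf")
        for f in selected:
            subset = [g for g in selected if g != f]
            s = score(subset)
            if s > best_score:
                best_subset, best_score = subset, s
        if best_score < score(selected):
            break  # every removal hurts: stop
        selected = best_subset
    return selected

relevant = {0, 2}
score = lambda subset: len(relevant & set(subset)) - 0.1 * len(subset)
print(backward_elimination(range(4), score))  # → [0, 2]
```

Each outer iteration costs one evaluation per remaining feature, so the total number of model evaluations is quadratic in the number of features, dominated by the first stages.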
In some cases, the FS outcome will still leave a relatively large number of relevant features, which may even inhibit the use of complex learning methods.
7.3.4 Using Decision Trees for Feature Selection
The usage of decision trees for FS has one major advantage, known as the "anytime" property [47]. Decision trees can be used to implement a trade-off between the performance of the selected features and the computation time required to find a subset. Decision tree inducers can be considered anytime algorithms for FS, since they gradually improve performance and can be stopped at any time, providing sub-optimal feature subsets. In fact, decision trees have been used as an evaluation methodology for directing the FS search.
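The anytime idea can be sketched with the split criterion a decision-tree inducer uses. Ranking features by information gain (the criterion behind ID3-style trees) and growing the selected subset one feature at a time means that stopping after any number of steps still yields a usable, if sub-optimal, subset. The toy data and function names are illustrative assumptions:

```python
# Sketch of "anytime" feature selection via a tree-style criterion:
# rank discrete features by information gain and take as many as the
# time budget allows; stopping early still yields a usable subset.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def info_gain(X, y, j):
    """Information gain of discrete feature j with respect to labels y."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(row[j], []).append(label)
    remainder = sum(len(g) / len(y) * entropy(g) for g in groups.values())
    return entropy(y) - remainder

def anytime_selection(X, y, budget):
    """Return up to `budget` feature indices in decreasing order of gain."""
    ranked = sorted(range(len(X[0])), key=lambda j: -info_gain(X, y, j))
    return ranked[:budget]

# toy data: feature 0 determines the class, feature 1 is irrelevant
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = ['a', 'a', 'b', 'b']
print(anytime_selection(X, y, budget=1))  # → [0]
```

A full tree inducer would recompute gains within each node's partition rather than once globally, but the trade-off is the same: more computation time buys a better subset, and interruption at any point still returns the best features found so far.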