In the literature, several works deal with feature set
partitioning. In one study, the features are grouped according to
feature type: nominal-valued, numeric-valued and text-valued
features [Kusiak (2000)]. A similar approach was also used in
developing the linear Bayes classifier [Gama (2000)]. The basic idea is
to aggregate the features into two subsets: the first containing only
the nominal features and the second only the continuous features.
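This type-based split can be sketched in a few lines. The helper and the sample feature table below are illustrative, not taken from the cited works: features whose values are all numeric go into the continuous subset, everything else into the nominal one, so each subset can be modeled separately and the models combined later.

```python
# Minimal sketch of type-based feature partitioning: split a
# {name: values} table into a nominal and a continuous subset.
# Names and data are illustrative only.

def partition_by_type(columns):
    """Split {name: values} into nominal (non-numeric) and continuous groups."""
    nominal, continuous = {}, {}
    for name, values in columns.items():
        # booleans are ints in Python, so exclude them explicitly
        if all(isinstance(v, (int, float)) and not isinstance(v, bool)
               for v in values):
            continuous[name] = values
        else:
            nominal[name] = values
    return nominal, continuous

features = {
    "color": ["red", "blue", "red"],    # nominal
    "weight": [1.2, 3.4, 2.2],          # continuous
    "label_text": ["ok", "bad", "ok"],  # nominal (text)
}
nominal, continuous = partition_by_type(features)
print(sorted(nominal), sorted(continuous))
```

In practice, a classifier suited to each feature type (e.g. a discrete naive Bayes on the nominal subset, a Gaussian model on the continuous one) would be trained per subset and their outputs combined.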
In another study, the feature set was decomposed according to
the target class [Tumer and Ghosh (1996)]. For each class, the features
with low correlation to that class were removed. This method
was applied to a feature set of 25 sonar signals, where the goal was
to identify the source of the sound (whale, cracking ice, etc.). Feature
set partitioning has also been used for radar-based volcano recognition
[Cherkauer (1996)]. The researcher manually decomposed a set of
119 features into 8 subsets, grouping together features based on
different image-processing operations. For each subset,
four neural networks of different sizes were then built. A new combining
framework for feature set partitioning has been used for text-independent
speaker identification [Chen et al. (1997)]. Other researchers manually
decomposed the feature set of a certain truck backer-upper problem and
reported that this strategy has important advantages [Jenkins and Yuhas
(1993)].
Feature set decomposition can also be obtained by grouping features
based on pairwise mutual information, with statistically similar features
assigned to the same group [Liao and Moody (2000)]. For this purpose, one
can use an existing hierarchical clustering algorithm. Several feature
subsets are then constructed by selecting one feature from each
group. A neural network is subsequently trained for each subset, and all
networks are then combined.
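The grouping step can be sketched with stdlib Python alone. This is a simplified stand-in for the cited approach: pairwise mutual information is computed between discrete features, features with MI above a threshold are merged into one group (a crude single-linkage clustering), and one representative is selected per group. The threshold and the toy data are assumptions for illustration.

```python
# Sketch of mutual-information-based feature grouping: merge features
# with high pairwise MI, then pick one representative per group.
# Threshold and data are illustrative, not from the cited work.
import math
from collections import Counter

def mutual_information(xs, ys):
    """MI (in nats) between two discrete feature columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        p = c / n
        # p(x,y) * log( p(x,y) / (p(x) p(y)) )
        mi += p * math.log(p * n * n / (px[x] * py[y]))
    return mi

def group_features(data, threshold=0.3):
    """Single-linkage-style merging of features with MI > threshold."""
    names = list(data)
    groups = [{name} for name in names]
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if mutual_information(data[a], data[b]) > threshold:
                ga = next(g for g in groups if a in g)
                gb = next(g for g in groups if b in g)
                if ga is not gb:
                    ga |= gb
                    groups.remove(gb)
    return groups

data = {
    "f1": [0, 0, 1, 1, 0, 1],
    "f2": [0, 0, 1, 1, 0, 1],  # duplicate of f1 -> high MI, same group
    "f3": [1, 0, 0, 1, 1, 0],  # weakly related -> its own group
}
groups = group_features(data)
representatives = sorted(min(g) for g in groups)
print(representatives)  # one feature selected from each group
```

A full pipeline would repeat the selection to build several disjoint subsets, train one network per subset, and combine the networks' outputs.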
In the statistics literature, the best-known feature-oriented ensemble
algorithm is the MARS algorithm [Friedman (1991)]. In this algorithm, a
multiple regression function is approximated using linear splines and their
tensor products. It has been shown that the algorithm performs an ANOVA
decomposition, namely, the regression function is represented as a grand
total of several sums. The first sum is over all basis functions that involve
only a single attribute. The second sum is over all basis functions that involve
exactly two attributes, representing (if present) two-variable interactions.
Similarly, the third sum represents (if present) the contributions from three-
variable interactions, and so on.
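The decomposition described above can be sketched in Friedman's notation, where each sum collects the basis functions involving exactly one, two, three, etc. variables (the symbols $a_0$, $f_i$, $f_{ij}$ here follow that convention):

$$
\hat{f}(x) \;=\; a_0 \;+\; \sum_{i} f_i(x_i) \;+\; \sum_{i<j} f_{ij}(x_i, x_j) \;+\; \sum_{i<j<k} f_{ijk}(x_i, x_j, x_k) \;+\; \cdots
$$

Here $a_0$ is the constant term, the first sum contains the single-attribute contributions, the second the two-variable interactions, and so on.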