There are other ways to obtain random forests. For example, instead of
using all the instances to determine the best split point for each feature, a
sub-sample of the instances is used [Kamath and Cantu-Paz (2001)]. This
sub-sample varies with the feature. The feature and split value that optimize
the splitting criterion are chosen as the decision at that node. Since the split
made at a node is likely to vary with the sample selected, this technique
results in different trees which can be combined in ensembles.
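The following is a minimal, illustrative sketch of this idea, not the cited authors' code: for each feature a fresh random sub-sample of the instances is drawn, candidate thresholds are scored on that sub-sample with the Gini criterion, and the best (feature, threshold) pair is returned. The function names and the choice of Gini impurity are assumptions made for the example.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def weighted_gini(x, y, threshold):
    """Weighted Gini impurity of the two children produced by a threshold."""
    left = x <= threshold
    if left.all() or not left.any():
        return np.inf
    n = len(y)
    return left.sum() / n * gini(y[left]) + (~left).sum() / n * gini(y[~left])

def randomized_split(X, y, sample_size=50, rng=None):
    """For every feature, score candidate thresholds on a fresh random
    sub-sample of the instances; return the best (feature, threshold) pair."""
    rng = rng or np.random.default_rng()
    best_feature, best_threshold, best_score = None, None, np.inf
    for j in range(X.shape[1]):
        idx = rng.choice(len(y), size=min(sample_size, len(y)), replace=False)
        xs, ys = X[idx, j], y[idx]
        u = np.unique(xs)
        for t in (u[:-1] + u[1:]) / 2:   # midpoints between sorted values
            score = weighted_gini(xs, ys, t)
            if score < best_score:
                best_feature, best_threshold, best_score = j, t, score
    return best_feature, best_threshold
```

Because each feature is evaluated on a different random sub-sample, repeated calls on the same data can yield different splits, which is exactly what makes the resulting trees diverse enough to combine in an ensemble.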
Another method for randomization of the decision tree through histograms was proposed by Kamath et al. (2002). The use of histograms
has long been suggested as a way of making the features discrete, while
reducing the time to handle very large datasets. Typically, a histogram is
created for each feature, and the bin boundaries used as potential split
points. The randomization in this process is introduced by selecting the split
point randomly in an interval around the best bin boundary.
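A minimal sketch of the histogram-based variant for a single numeric feature follows; it is an assumed, simplified implementation rather than Kamath et al.'s code, and the interval width controlled by `jitter` is an illustrative parameter. Bin boundaries serve as candidate split points, the best boundary is found with the Gini criterion, and the final split is drawn at random from an interval around it.

```python
import numpy as np

def histogram_random_split(x, y, n_bins=10, jitter=0.5, rng=None):
    """Use histogram bin boundaries as candidate split points, then draw the
    actual split uniformly from an interval around the best boundary."""
    rng = rng or np.random.default_rng()
    edges = np.histogram_bin_edges(x, bins=n_bins)
    candidates = edges[1:-1]            # interior bin boundaries
    width = edges[1] - edges[0]         # bin width defines the jitter interval

    def gini(labels):
        _, c = np.unique(labels, return_counts=True)
        p = c / c.sum()
        return 1.0 - np.sum(p ** 2)

    def score(t):
        left = x <= t
        if left.all() or not left.any():
            return np.inf
        n = len(y)
        return left.sum() / n * gini(y[left]) + (~left).sum() / n * gini(y[~left])

    best = min(candidates, key=score)
    # Randomization step: pick the split point near the best bin boundary.
    return rng.uniform(best - jitter * width, best + jitter * width)
```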
Although the random forest was defined for decision trees, this
approach is applicable to all types of classifiers. One important advantage
of the random forest method is its ability to handle a very large number of
input attributes [Skurichina and Duin (2002)]. Another important feature
of the random forest is that it is fast.
9.4.2.4 Rotation Forest
Similarly to Random Forest, the aim of Rotation Forest is to independently
build an accurate and diverse set of classification trees. Recall that in Random
Forest the diversity among the base trees is obtained by training each tree
on a different bootstrap sample of the dataset and by randomizing the
feature choice at each node. In Rotation Forest, on the other hand, the
diversity among the base trees is achieved by training each tree on the whole
dataset in a rotated feature space. Because tree induction algorithms split
the input space using hyperplanes parallel to the feature axes, rotating the
axes just before running the tree induction algorithm may result in a
very different classification tree.
More specifically, the main idea is to use feature extraction methods
to build a full feature set for each tree in the forest. To this end, we first
randomly split the feature set into K mutually exclusive partitions. Then
we apply principal component analysis (PCA) separately to each feature
partition. PCA is a well-established statistical procedure that was invented
in 1901 by Karl Pearson. The idea of PCA is to orthogonally transform
possibly correlated features into a set of linearly uncorrelated features
(called principal components). Each component is a linear combination
of the original features.
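A minimal sketch of the rotation step for a single tree is shown below, assuming scikit-learn's PCA and DecisionTreeClassifier; the function name and parameters are illustrative, mean centering is omitted for brevity, and refinements from the original Rotation Forest algorithm (such as applying PCA to bootstrap samples of class subsets) are left out. The features are randomly partitioned into K groups, PCA is fitted on each group with all components retained, the per-group loadings are assembled into a block-diagonal rotation matrix, and the tree is trained on the rotated data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

def fit_rotation_tree(X, y, K=3, rng=None):
    """Train a single decision tree on a PCA-rotated copy of the data."""
    rng = rng or np.random.default_rng()
    n_features = X.shape[1]
    order = rng.permutation(n_features)
    partitions = np.array_split(order, K)       # K mutually exclusive partitions

    # Block-diagonal rotation matrix: one PCA (all components kept) per partition.
    R = np.zeros((n_features, n_features))
    for part in partitions:
        pca = PCA(n_components=len(part)).fit(X[:, part])
        R[np.ix_(part, part)] = pca.components_.T

    tree = DecisionTreeClassifier().fit(X @ R, y)
    return tree, R    # keep R so new samples can be rotated before prediction

# Usage sketch: rotate test data with the same matrix before predicting.
# tree, R = fit_rotation_tree(X_train, y_train, K=3)
# y_pred = tree.predict(X_test @ R)
```

Because every base tree draws its own random feature partition, each tree sees the data in a different rotated coordinate system, which yields diverse trees even though all of them are trained on the full dataset.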