disjoint subsets F_{i,j} of equal size M. For every subset, we randomly select
a non-empty subset of classes and then draw a bootstrap sample which
includes 3/4 of the original sample. Then we apply PCA using only the
features in F_{i,j} and the selected subset of classes. The obtained coefficients
of the principal components, a_{i,1}, a_{i,2}, ..., are employed to create the sparse
"rotation" matrix R_i. Finally, we use SR_i for training the base classifier
M_i. In order to classify an instance, we calculate the average confidence
for each class across all classifiers, and then assign the instance to the class
with the largest confidence.
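The procedure above can be sketched in plain NumPy. This is a minimal illustration, not the reference implementation: a nearest-centroid learner stands in for the decision trees the method actually uses, and all function names (build_rotation, rotation_forest_fit, etc.) are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def pca_components(X):
    """Principal axes of X (as columns), via eigendecomposition of the covariance."""
    C = np.atleast_2d(np.cov(X, rowvar=False))
    _, vecs = np.linalg.eigh(C)
    return vecs  # orthogonal matrix

def build_rotation(X, y, n_subsets=3):
    """Sparse 'rotation' matrix R_i: PCA run independently on disjoint feature subsets."""
    n = X.shape[1]
    subsets = np.array_split(rng.permutation(n), n_subsets)
    R = np.zeros((n, n))
    classes = np.unique(y)
    for feats in subsets:
        # randomly select a non-empty subset of classes
        k = rng.integers(1, len(classes) + 1)
        chosen = rng.choice(classes, size=k, replace=False)
        idx = np.flatnonzero(np.isin(y, chosen))
        # bootstrap sample covering 3/4 of the selected instances
        boot = rng.choice(idx, size=max(1, int(0.75 * len(idx))), replace=True)
        # PCA coefficients fill the block of R corresponding to this feature subset
        R[np.ix_(feats, feats)] = pca_components(X[np.ix_(boot, feats)])
    return R

class NearestCentroid:
    """Stand-in base classifier (the method itself uses decision trees)."""
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.centroids_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        return self
    def predict_proba(self, X):
        d = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        conf = 1.0 / (d + 1e-9)
        return conf / conf.sum(axis=1, keepdims=True)

def rotation_forest_fit(X, y, n_classifiers=10):
    ensemble = []
    for _ in range(n_classifiers):
        R = build_rotation(X, y)
        ensemble.append((R, NearestCentroid().fit(X @ R, y)))  # train on rotated data
    return ensemble

def rotation_forest_predict(ensemble, X):
    # average confidence per class across all classifiers, then take the argmax
    probs = np.mean([clf.predict_proba(X @ R) for R, clf in ensemble], axis=0)
    return probs.argmax(axis=1)
```

Because each block of R_i is orthogonal, R_i as a whole is an orthogonal (distance-preserving) transformation, which is why a distance-based stand-in classifier still works on the rotated data.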
The Rotation Forest algorithm is implemented as part of the Weka
suite. The MATLAB code of Rotation Forest can be obtained from:
http://www.mathworks.com/matlabcentral/fileexchange/38792-rotation-forest-algorithm.
The experimental study conducted by the inventors of Rotation Forest
shows that Rotation Forest outperforms Random Forest in terms of accuracy.
However, Rotation Forest has two drawbacks. First, Rotation Forest
is usually more computationally intensive than Random Forest, mainly
because the computational complexity of PCA is more than linear (more
specifically O(mn^2), where m is the number of instances and n is the number
of features). Another drawback is that the nodes in the obtained trees test
the transformed features and not the original features. This can make it
harder for the user to understand the tree: instead of examining a
single feature in each node of the tree, the user needs to examine a linear
combination of the features in each node of the tree.
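The O(mn^2) term can be seen directly in the covariance step of PCA: building the n-by-n covariance matrix from m instances requires a multiply-add for every (instance, feature-pair) combination. A short NumPy check (illustrative sizes only):

```python
import numpy as np

m, n = 1000, 50
X = np.random.default_rng(0).normal(size=(m, n))

# Forming the covariance matrix is the m*n*n part of PCA:
# the product Xc.T @ Xc performs roughly m multiply-adds
# for each of the n*n entries, hence O(m n^2).
Xc = X - X.mean(axis=0)
C = Xc.T @ Xc / (m - 1)

assert C.shape == (n, n)
assert np.allclose(C, np.cov(X, rowvar=False))
```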
Zhang and Zhang (2008) present the RotBoost algorithm, which
combines the ideas of Rotation Forest and AdaBoost. RotBoost achieves an
even lower prediction error than either of the two algorithms. RotBoost
is presented in Figure 9.15. In each iteration, a new rotation matrix is
generated and used to create a dataset; the AdaBoost ensemble is then
induced from this dataset.
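The outer loop of this idea can be sketched as follows. This is a simplified, self-contained illustration: the rotation here is plain PCA on a bootstrap sample (the full algorithm builds the sparse block rotation from feature subsets described earlier), the base learners are decision stumps rather than full trees, labels are assumed to be in {-1, +1}, and all names (rotboost, best_stump, etc.) are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def best_stump(X, y, w):
    """Exhaustive weighted decision stump; y must be in {-1, +1}."""
    best = (0, 0.0, 1, np.inf)  # (feature, threshold, polarity, weighted error)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, t, pol, err)
    return best

def adaboost(X, y, T=10):
    """Standard AdaBoost with stumps: reweight instances after each round."""
    w = np.full(len(y), 1.0 / len(y))
    model = []
    for _ in range(T):
        j, t, pol, err = best_stump(X, y, w)
        err = max(err, 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        model.append((j, t, pol, alpha))
    return model

def ada_predict(model, X):
    s = sum(alpha * np.where(pol * (X[:, j] - t) >= 0, 1, -1)
            for j, t, pol, alpha in model)
    return np.sign(s)

def rotboost(X, y, S=3, T=10):
    """Each outer iteration: draw a new rotation matrix, then boost on the rotated data."""
    ensembles = []
    for _ in range(S):
        boot = rng.choice(len(y), size=len(y), replace=True)
        Xc = X[boot] - X[boot].mean(axis=0)
        _, _, Vt = np.linalg.svd(Xc, full_matrices=False)  # PCA axes of the bootstrap
        R = Vt.T
        ensembles.append((R, adaboost(X @ R, y, T)))
    return ensembles

def rotboost_predict(ensembles, X):
    # majority vote over the S AdaBoost ensembles, each in its own rotated space
    return np.sign(sum(ada_predict(model, X @ R) for R, model in ensembles))
```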
In conclusion, Rotation Forest is an ensemble generation method which
aims at building accurate and diverse classifiers [Rodriguez et al. (2006)].
The main idea is to apply feature extraction to subsets of features in order
to reconstruct a full feature set for each classifier in the ensemble. Rotation
Forest ensembles tend to generate base classifiers which are more accurate
than those created by AdaBoost and by Random Forest, and more diverse
than those created by bagging. Decision trees were chosen as the base classi-
fiers because of their sensitivity to rotation of the feature axes, while remain-
ing very accurate. Feature extraction is based on PCA which is a valuable