Fig. 10.8 J48 tree visualization.
indicating the ensemble size (i.e. the number of iterations). Figure 10.9
presents the properties list of the Rotation Forest algorithm. Note that the
classifier property should refer to a decision tree algorithm (such as J48)
in order to actually build a forest. In many cases, tuning the number of
iterations by trial and error helps improve the
predictive performance.
10.3 R
R is a free software programming language that is widely used among data
scientists to develop data mining algorithms. Most of a data scientist's
tasks can be accomplished in R with a short script, thanks to the
diversity and richness of the contributed packages on CRAN (the Comprehensive
R Archive Network). In this section, we review three packages, party, rpart
and randomForest, which are used to train decision trees and decision
forests.
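Before using these packages, it can help to check that they are available in the local R library. The snippet below is a minimal sketch, not part of the original text; the variable names pkgs and available are illustrative, and the install.packages() call is left commented out because it needs network access to a CRAN mirror.

```r
# Packages reviewed in this section
pkgs <- c("party", "rpart", "randomForest")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(available)

# Install whatever is missing (uncomment to run; requires a CRAN mirror)
# install.packages(pkgs[!available])
```

Once a package is installed, library(party), library(rpart) or library(randomForest) attaches it to the current session.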
10.3.1 Party Package
The party package provides a set of tools for training classification and
regression trees [Hothorn et al. (2006)]. At the core of the party package
is the ctree() function, which implements a conditional inference
procedure to train the tree. The following R script illustrates the usage of
the ctree function for training a classification tree.
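As a rough illustration of this kind of usage, the sketch below trains a conditional inference tree on the iris data set that ships with R. It is an assumed example rather than a listing from the text: the 70/30 split, the seed, and the names train_idx, train, test and iris_ctree are choices made here, and the party package is assumed to be installed.

```r
# Assumes the party package is installed: install.packages("party")
library(party)

# iris is a built-in data set: 150 flowers, 4 numeric features, 3 species
data(iris)

# Hold out 30% of the rows for testing (split ratio and seed are arbitrary)
set.seed(42)
train_idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# Train a classification tree: Species is the target,
# all remaining columns are candidate input attributes
iris_ctree <- ctree(Species ~ ., data = train)

# Inspect the fitted tree as text and as a plot
print(iris_ctree)
plot(iris_ctree)

# Classify the held-out rows and compare with the true labels
pred <- predict(iris_ctree, newdata = test)
conf_mat <- table(predicted = pred, actual = test$Species)
print(conf_mat)

# Fraction of held-out rows classified correctly
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(accuracy)
```

Because the target Species is a factor, ctree() builds a classification tree; with a numeric target the same call would fit a regression tree.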