library(rpart)
fit <- rpart(Species ~ ., data = iris)
plot(fit)
text(fit, use.n = TRUE)
The function rpart() can induce not only classification trees but also
regression trees, survival trees and Poisson trees. This is done by setting
the method argument to one of the following values: "anova" (for a regression
tree), "class" (for a classification tree), "exp" (for a survival tree) or
"poisson" (for a Poisson regression tree). If the method argument is not set,
the function guesses it from the target attribute: a survival object implies
"exp", a two-column target implies "poisson", a factor implies "class", and
anything else implies "anova". Note that Poisson regression is a type of
regression that models event counts or rates under the assumption that the
counts follow a Poisson distribution. For example, in a manufacturing scenario
the target attribute may refer to the defect rate. Poisson regression assumes
that the logarithm of the target attribute's expected value can be represented
as a linear combination of the input attributes.
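As a minimal sketch of how the method argument selects the tree type, the following script fits one tree of each kind on data sets that ship with R. The choice of mtcars and warpbreaks is an assumption made purely for illustration; neither data set appears in the original text. A survival tree (method = "exp") is omitted because it requires a survival object from the survival package.

```r
library(rpart)

# Regression tree: numeric target, so method = "anova"
fit_reg <- rpart(mpg ~ ., data = mtcars, method = "anova")

# Classification tree: factor target, so method = "class"
fit_cls <- rpart(Species ~ ., data = iris, method = "class")

# Poisson tree: the target is an event count (breaks per loom)
fit_pois <- rpart(breaks ~ wool + tension, data = warpbreaks,
                  method = "poisson")

# With method omitted, rpart() guesses it from the type of the target
fit_guess <- rpart(Species ~ ., data = iris)
print(fit_guess$method)
```

Because Species is a factor, the guessed method in the last call is "class".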
The function rpart() lets the user select the splitting criterion: Gini
index or information gain. This is done through the parms argument, by
passing either parms = list(split = "information") or
parms = list(split = "gini"). In addition, similarly to ctree(), a control
object can be included to set various parameters such as minsplit, minbucket
and maxdepth. The following script illustrates this tuning capability by
indicating that the splitting criterion is information gain and that a node
must contain at least 5 instances before a split is attempted.
fit <- rpart(Species ~ ., data = iris,
             parms = list(split = "information"),
             control = rpart.control(minsplit = 5))
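The sketch below combines several control parameters in one call and, for comparison, fits the same tree with the Gini index. Note that rpart() reads the splitting criterion from its parms argument. The particular values minbucket = 2 and maxdepth = 3 are illustrative assumptions, not recommendations from the text.

```r
library(rpart)

# Stopping rules shared by both trees (values are illustrative)
ctrl <- rpart.control(minsplit = 5, minbucket = 2, maxdepth = 3)

# Same data and stopping rules, different splitting criterion
fit_info <- rpart(Species ~ ., data = iris,
                  parms = list(split = "information"), control = ctrl)
fit_gini <- rpart(Species ~ ., data = iris,
                  parms = list(split = "gini"), control = ctrl)

# Print both trees to compare the chosen splits
print(fit_info)
print(fit_gini)
```

Keeping the control object in a separate variable makes it easy to reuse the same stopping rules while varying only the criterion.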
10.3.5 RandomForest
In the previous section, we have seen how the function cforest() can be
used to build an ensemble of classification trees. An alternative way to build
a random forest is to use the randomForest package [Liaw and Wiener
(2002)].
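A minimal sketch of such a forest is shown below, assuming the randomForest package is installed. The value ntree = 100 and the seed are arbitrary choices made for illustration only.

```r
library(randomForest)

set.seed(42)  # arbitrary seed, only to make the run repeatable

# Grow a forest of 100 classification trees on the iris data
rf <- randomForest(Species ~ ., data = iris, ntree = 100, importance = TRUE)

print(rf)        # out-of-bag error estimate and confusion matrix
importance(rf)   # variable importance measures
```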