CART can be applied to both categorical and quantitative response
variables. When CART is applied with a quantitative response variable,
the procedure is known as “Regression Trees”. At each step, heterogeneity
is now measured by the within-node sum of squares of the response:
$$ i(\tau) = \sum_{i \in \tau} \left( y_i - \bar{y}(\tau) \right)^2, \tag{8.1} $$
where for node $\tau$ the summation is over all cases in that node, and $\bar{y}(\tau)$ is
the mean of those cases. The heterogeneity for each potential split is the
sum of the two sums of squares for the two nodes that would result. The
split chosen is the one that most reduces this within-node sum of squares:
the sum of squares of the parent node is compared to the combined sums of
squares from each potential split into two offspring nodes. Generalization to Poisson
regression (for count data) follows with the deviance used in place of the
sum of squares.
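The split search described above can be sketched in a few lines of Python. This is a minimal illustration for a single numeric predictor, not the implementation used by any particular package; the function names (sum_of_squares, poisson_deviance, best_split) and the choice of midpoints as candidate thresholds are assumptions made for this example.

```python
import math
from typing import Callable, Optional, Sequence, Tuple


def sum_of_squares(y: Sequence[float]) -> float:
    """Within-node sum of squares i(tau), as in Eq. (8.1)."""
    if not y:
        return 0.0
    mean = sum(y) / len(y)
    return sum((v - mean) ** 2 for v in y)


def poisson_deviance(y: Sequence[float]) -> float:
    """Poisson deviance of a node around its mean (non-negative counts)."""
    if not y:
        return 0.0
    mean = sum(y) / len(y)
    if mean == 0:
        return 0.0
    total = 0.0
    for v in y:
        if v > 0:  # the term v * log(v / mean) is taken as 0 when v == 0
            total += v * math.log(v / mean)
        total -= v - mean
    return 2.0 * total


def best_split(
    x: Sequence[float],
    y: Sequence[float],
    impurity: Callable[[Sequence[float]], float] = sum_of_squares,
) -> Optional[Tuple[float, float]]:
    """Return (threshold, reduction) for the best split x <= threshold.

    Returns None when no split is possible (fewer than two distinct x values).
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    parent = impurity(ys)
    best: Optional[Tuple[float, float]] = None
    for k in range(1, len(xs)):
        if xs[k] == xs[k - 1]:
            continue  # identical x values cannot be separated
        # Heterogeneity of a split = sum over the two offspring nodes.
        children = impurity(ys[:k]) + impurity(ys[k:])
        reduction = parent - children
        if best is None or reduction > best[1]:
            best = ((xs[k - 1] + xs[k]) / 2, reduction)
    return best


if __name__ == "__main__":
    x = [1, 2, 3, 10, 11, 12]
    y = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
    print(best_split(x, y))
```

Passing impurity=poisson_deviance swaps the sum of squares for the deviance, which is the generalization to count data mentioned above. A full tree would apply best_split over every predictor and recurse on the two offspring nodes.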
Several software packages provide implementations of regression tree
models. Weka provides the REPTree learner, which fits a regression tree
according to the threshold and tree pruning settings; MATLAB offers the
fitrtree function for the same purpose. The party package in R provides a
set of tools for training regression trees. Section 10.3 presents a
walk-through guide for building Regression Trees in R.
8.3 Survival Trees
Survival analysis is a collection of statistical procedures for analyzing data
in which the outcome variable of interest is the time until an event occurs. An
event might be a death, machine end of life, disease incidence, component
failure, recovery or any designated experience of interest that may happen
to an individual/object on which the study is focused.
The main goals of survival analysis are estimating the relative effects
of risk factors and predicting survival probabilities/survival times. Cox's
proportional hazards model is probably the most obvious way to fulfill the
first goal; it assumes a linear association between the log hazard and the
covariates.
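For reference, the Cox model is usually written in the following standard form. The notation ($h_0$, $\beta$, $x$, $p$) is introduced here for illustration and does not come from the text above.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)
\quad\Longleftrightarrow\quad
\log \frac{h(t \mid x)}{h_0(t)} = \beta_1 x_1 + \cdots + \beta_p x_p
```

Here $h_0(t)$ is an unspecified baseline hazard. Each coefficient $\beta_j$ is the change in log hazard per unit change in $x_j$, so $\exp(\beta_j)$ is the hazard ratio that expresses the relative effect of that risk factor.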
Survival analysis has been used extensively in medical research. Most of
the work in the medical field focuses on studying survival curves of patients
suffering from different ailments. However, survival analysis can also be used in
the engineering domain. For example, Eyal et al. (2014) show that survival
analysis can be used to predict failures of automobiles. Automobiles can
be considered analogous to humans due to their complexity and the wide
variety of factors which impact their life expectancy.