To learn the above characteristics of a population (or, in terms of
engineering systems, a group of components), a sufficient amount of data
is needed. The data should include enough instances of objects in the
population, information about several properties of each individual
(values of the explanatory variables), and the time of failure or at least
the latest time up to which a subject is known to have survived. The basic
terms of survival analysis are described below.
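As an illustration of the data requirements just listed, the following is a minimal sketch of how such records might be held in Python. The class name `Subject`, the helper `summarize`, and the covariate names `load` and `temp` are hypothetical, and the values are made up for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Subject:
    """One individual (or component) in the population. Hypothetical layout."""
    covariates: Dict[str, float]  # values of the explanatory variables
    time: float                   # failure time, or last time known to have survived
    event: bool                   # True if failure was observed, False if censored


def summarize(subjects: List[Subject]) -> Dict[str, float]:
    """Count observed failures and censored subjects in a sample."""
    n_events = sum(1 for s in subjects if s.event)
    return {
        "n": len(subjects),
        "events": n_events,
        "censored": len(subjects) - n_events,
        "max_follow_up": max(s.time for s in subjects),
    }


# Made-up example records: two observed failures and one censored subject.
subjects = [
    Subject({"load": 0.8, "temp": 70.0}, time=120.0, event=True),
    Subject({"load": 0.5, "temp": 65.0}, time=300.0, event=False),
    Subject({"load": 0.9, "temp": 80.0}, time=45.0, event=True),
]
summary = summarize(subjects)
print(summary)
```

Keeping the event flag next to the time makes it explicit whether a recorded time is an actual failure or only the last time at which the subject was known to have survived.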
In survival analysis, tree structures are used for analyzing
multi-dimensional data by partitioning it and identifying local structures
within it. Tree methods are non-parametric, since they do not assume any
distributional properties of the data. Tree-based methods for univariate
and multivariate survival data have been proposed and evaluated by many
authors. Some of these works suggest extensions to the CART algorithm.
The leaves of the tree hold the hazard functions. The hazard function is
the probability that an individual will experience an event (for example,
death) within a small time interval, given that the individual has survived
up to the beginning of the interval.
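The definition of the hazard above can be sketched for discrete time intervals: within each interval, the estimated hazard is the number of observed events divided by the number of subjects still at risk when the interval begins. The function name `discrete_hazard`, the interval edges, and the sample data are assumptions made for illustration. The sketch also simplifies by counting a subject censored inside an interval as at risk for that whole interval.

```python
from typing import List, Tuple


def discrete_hazard(
    times: List[float], events: List[bool], edges: List[float]
) -> List[Tuple[float, float, float]]:
    """Estimate the hazard in each interval [edges[i], edges[i+1]).

    hazard = (events observed inside the interval)
             / (subjects still at risk at the start of the interval)
    """
    result = []
    for start, end in zip(edges[:-1], edges[1:]):
        # At risk: neither failed nor censored before the interval starts.
        at_risk = sum(1 for t in times if t >= start)
        # Only observed failures count as events; censored times do not.
        failed = sum(
            1 for t, e in zip(times, events) if e and start <= t < end
        )
        hazard = failed / at_risk if at_risk > 0 else float("nan")
        result.append((start, end, hazard))
    return result


# Made-up sample: event=False marks a censored observation.
times = [2.0, 3.0, 3.5, 5.0, 7.0, 8.0]
events = [True, True, False, True, False, True]
hazards = discrete_hazard(times, events, [0.0, 4.0, 8.0, 12.0])
for start, end, h in hazards:
    print(f"[{start}, {end}): hazard = {h:.3f}")
```

In a survival tree, an estimate of this kind would be computed separately from the subjects that fall into each leaf.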
Some survival trees are based on conditional hypothesis testing. This
framework provides a solution for models with continuous and categorical
data and includes procedures applicable to censored, ordinal, or
multivariate responses. In this framework, the conditional distribution of
statistics that measure the association between responses and factors
serves as a basis for unbiased selection among factors. This algorithm, known as
Conditional Inference Trees (C-Tree algorithm), uses classical statistical
tests to select a split point. The selection is based on the minimum
p-value among all tests of independence between the response variable and
each factor, where the p-value is the probability, under the null
hypothesis of independence, of observing an association at least as strong
as the one actually observed. This framework derives unbiased estimates of
factors and correctly handles ordinal response variables. Statistically
motivated stopping criteria implemented via hypothesis tests yield a
predictive performance that is equivalent to that of optimally pruned
trees. Based on this observation, the authors of the framework offered an
intuitive and computationally efficient solution to the overfitting
problem.
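The selection step described above can be sketched as follows. This is a simplified illustration, not the C-Tree implementation: it assumes an uncensored numeric response, uses an absolute covariance-type statistic, approximates the conditional distribution by random permutations, and applies a Bonferroni adjustment before the stopping test. All function names, the factor names `age` and `noise`, and the data are hypothetical. A censored response would first need a suitable score transformation.

```python
import random
from typing import Dict, List, Optional, Tuple


def association(x: List[float], y: List[float]) -> float:
    """Absolute covariance-type statistic between a factor and the response."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return abs(sum((a - mx) * (b - my) for a, b in zip(x, y)))


def permutation_p_value(
    x: List[float], y: List[float], n_perm: int, rng: random.Random
) -> float:
    """Share of permutations whose statistic is at least the observed one."""
    observed = association(x, y)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # break any link between factor and response
        if association(x, y_perm) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def select_split_factor(
    factors: Dict[str, List[float]],
    response: List[float],
    alpha: float = 0.05,
    n_perm: int = 999,
    seed: int = 0,
) -> Tuple[Optional[str], Dict[str, float]]:
    """Pick the factor with the smallest p-value, or None to stop splitting."""
    rng = random.Random(seed)
    p_values = {
        name: permutation_p_value(values, response, n_perm, rng)
        for name, values in factors.items()
    }
    best = min(p_values, key=p_values.get)
    # Bonferroni adjustment for testing several factors (an assumption here).
    adjusted = min(1.0, p_values[best] * len(factors))
    return (best if adjusted <= alpha else None), p_values


# Made-up data: the response depends on "age" but not on "noise".
age = [float(i) for i in range(20)]
noise = [float(i % 2) for i in range(20)]
response = [2.0 * a for a in age]

chosen, p_values = select_split_factor({"age": age, "noise": noise}, response)
stopped, _ = select_split_factor({"noise": noise}, response)
print(chosen, p_values, stopped)
```

The second call shows the stopping criterion: when no factor is significantly associated with the response, no split is made, so the tree stops growing without a separate pruning step.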
In recent years, survival tree methods have become popular tools for
survival analysis. As survival trees are non-parametric models, they do
not require predefined distributional assumptions. When a single tree
framework is used, the data is split by only a subset of the factors, and
the rest are disregarded because of the tree's stopping conditions, e.g., a
minimum number of observations in a terminal node. Therefore, a single tree-based