Figure 28. Dmine regression results for length of stay
We reduce the sample size to 1000 to see if the results are similar. Results indicate that the tree model
is now the best (Figure 29).
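As a rough illustration of the subsampling step, the sketch below draws a reproducible random sample of 1000 records without replacement. The helper name `subsample`, the seed value, and the stand-in record list are assumptions made for illustration; the text does not say how its sample was drawn.

```python
import random

def subsample(records, n=1000, seed=42):
    """Return n records drawn at random without replacement (hypothetical helper)."""
    if n > len(records):
        raise ValueError("sample size exceeds number of records")
    rng = random.Random(seed)  # fixed seed so the same sample can be redrawn
    return rng.sample(records, n)

# Stand-in data: record identifiers only, not real patient data.
records = list(range(10_000))
sample = subsample(records)
```

Fixing the seed means the comparison between the full data set and the reduced one can be repeated on the same 1000 observations.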
The r² value increases to 42% with the smaller set of observations compared to the previous value of
28%. The decision tree, too, becomes simpler (Figure 30). In this result, the use of venous catheterization,
congestive heart failure, and the patient's age are the only three variables used. None of the procedures
are significant. This r² value appears to be much more reasonable compared to models in Chapter 3.
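For readers who want to see what the r² figure measures, here is a minimal sketch of the usual coefficient of determination, one minus the ratio of residual to total sum of squares. The function name `r_squared` is an assumption, and the modeling software used in the text may report a variant of this statistic.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Squared error left over after using the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error from predicting the mean for every observation.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; r^2 is undefined")
    return 1.0 - ss_res / ss_tot
```

On this definition, a value of 0.42 would mean the model removes 42% of the squared error that remains when every length of stay is predicted by the mean.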
We consider one last series of models. We categorize length of stay using quantiles and define the
target variable as ordinal. Note that a misclassification rate has been added to the average error
(Figure 31). Standard regression is the optimal model.
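The quantile step and the misclassification rate can be sketched as follows. The choice of four bins (quartiles), the function names, and the sample values are assumptions for illustration; the text does not state how many quantiles were used.

```python
import bisect
import statistics

def quantile_categories(values, n_bins=4):
    """Map each value to an ordinal category 0..n_bins-1 using quantile cut points."""
    cuts = statistics.quantiles(values, n=n_bins)  # n_bins - 1 cut points
    # A value's category is the number of cut points at or below it.
    return [bisect.bisect_right(cuts, v) for v in values]

def misclassification_rate(actual, predicted):
    """Fraction of observations whose predicted category differs from the actual one."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

# Made-up lengths of stay in days, not taken from the data set in the text.
los_days = [1, 2, 3, 4, 5, 6, 7, 8]
categories = quantile_categories(los_days)
```

Because the categories are ordered, a prediction that is off by one category is a smaller error than one that is off by three, which the plain misclassification rate does not distinguish.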
The decision tree results are almost identical. Both procedures and diagnoses are statistically
significant in the tree (Figure 32). However, many of the branches of the tree are related to age. This
suggests that we should also have categorized age.
We will show in the next section how a predictive model can be used to rank the quality of providers.
This can be done with binary outcomes such as mortality, and with continuous outcomes such as length
of stay. Most commonly, the outcome used to define quality is that of mortality.
Figure 29. Model comparison for length of stay for 1000 observations