Information Technology Reference
In-Depth Information
Figure 36. Lift function for revised prediction model
Figure 36 gives the lift function for the testing set. It indicates that MRSA can be predicted for the
first 4 deciles. Therefore, we can now identify the high risk patients in the first four deciles. Moreover,
the patients in the last 3 deciles have a low risk of MRSA and would not require any prophylactic treat-
ment.
future trends
This chapter clearly demonstrates the use of text mining to define patient severity indices. It gives results
that are far superior to those currently in use defined using logistic regression. Future trends will be to
convert more indices to text mining analysis. Because it is not subject to the problems of regression
models, it is not as susceptible to the upcoding by providers (links between codes become more im-
portant than the addition of codes). For this reason, text-based severity measures should take the place
of regression-based measures. Because text mining takes advantage of the linkage between conditions,
this procedure is not concerned with the requirement of independence between codes. Because text is
considered to be somewhat unique, this technique also does not have to make the assumption of the
uniformity of data entry. The entry is not uniform, and we can compare differences in coding across
providers as well as differences across patients.
However, the practice of comparing the observed rate of mortality to the actual rate of mortality should
be reconsidered as well. It penalizes hospitals with very low rates of actual mortality and rewards hospitals
with very high rates of mortality because there can be a higher differential between actual and predicted
rates when the actual rate is high. While it can be very satisfactory to develop a “scorecard” to rank
providers, quality of care is multi-dimensional, and it should be examined in multiple dimensions.
While the National Inpatient Sample, used here, defines the datafile so that the text strings of diag-
nosis and procedure codes are fairly simple to define, we will show how they can be defined using more
complex claims data in the next chapter.
dIscussIon
Text mining concentrates on the linkage between patient diagnoses and procedures, assuming that there
are relationships. This assumption should be quite reasonable-procedures should be related to diagnoses,
and there are co-morbid conditions that are related such as diabetes and heart disease. The relationships
are assumed to be textual rather than linear. Moreover, it is not necessary to assume the uniformity of
data entry when defining a patient severity index using text mining.
An index defined using text mining can be used for purposes other than to rank the quality of provid-
Search WWH ::




Custom Search