Text Mining and Patient Severity Clusters - Text Mining Techniques for Healthcare Provider Quality Determination


Chapter 8

Text Mining and Patient Severity Clusters

Introduction

The problem with diagnosis codes is that there are far too many of them to use directly in a predictive model or regression. A predictive model requires categorical variables to have only a small number of levels, so the levels of the diagnosis variable must be compressed. Thus far, compression has been done in one of two ways. Either a predetermined list of codes counts toward risk adjustment, leaving many codes out (as in the Charlson Index), or consensus panels define categories of severity (as in the APRDRG Index). We have shown that, in many cases, some of the omitted codes carry as much risk as the included codes, if not more; patients with omitted conditions will therefore be identified as less severe than patients with included conditions. In this chapter, we introduce a method that compresses the diagnoses into clusters while still using all of the codes and without relying on consensus panels. Moreover, because outcomes are not used to define the severity index, they remain available to validate the model and can then be used to assess the quality of providers.
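To make the idea of compressing codes into clusters concrete, the following is a minimal sketch and not the chapter's actual procedure, which this excerpt does not show. It assumes each patient's diagnosis codes can be treated as a short text document, weights every code with a simple TF-IDF scheme, and groups patients with a small k-means routine. The patient identifiers, the code strings, the choice of k-means, and the function names `tfidf_vectors` and `kmeans` are all assumptions made for illustration. Note that no outcome variable enters the clustering, which is what leaves outcomes free for validation.

```python
import math
from collections import Counter

# Hypothetical data: each patient is a "document" whose words are diagnosis
# codes. The identifiers and code strings are placeholders, not real codes.
patients = {
    "p1": ["C101", "C102", "C103"],
    "p2": ["C101", "C102", "C104"],
    "p3": ["C201", "C202"],
    "p4": ["C201", "C202", "C203"],
}

def tfidf_vectors(docs):
    """Turn every patient's code list into a unit-length TF-IDF vector.

    All codes are kept in the vocabulary; none are dropped by a fixed list.
    """
    vocab = sorted({c for codes in docs.values() for c in codes})
    n = len(docs)
    # Document frequency: number of patients carrying each code.
    df = Counter(c for codes in docs.values() for c in set(codes))
    vectors = {}
    for pid, codes in docs.items():
        tf = Counter(codes)
        raw = [tf[c] * (math.log(n / df[c]) + 1.0) for c in vocab]
        norm = math.sqrt(sum(w * w for w in raw)) or 1.0
        vectors[pid] = [w / norm for w in raw]
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, max_iter=20):
    """Plain k-means with a deterministic farthest-point initialization."""
    ids = list(vectors)
    centers = [vectors[ids[0]]]
    while len(centers) < k:
        # Next center: the patient farthest from all centers chosen so far.
        far = max(ids, key=lambda p: min(sq_dist(vectors[p], c) for c in centers))
        centers.append(vectors[far])
    labels = {}
    for _ in range(max_iter):
        new_labels = {
            p: min(range(k), key=lambda j: sq_dist(vectors[p], centers[j]))
            for p in ids
        }
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [vectors[p] for p in ids if labels[p] == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

vocab, vectors = tfidf_vectors(patients)
labels = kmeans(vectors, k=2)
print(labels)
```

In this toy data, patients who share codes land in the same cluster, so a variable with seven distinct codes is reduced to a two-level cluster label that a predictive model could accept. With real data, the number of clusters and the weighting scheme would need to be chosen and checked against outcomes that were held out of the clustering.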

Perhaps the major reason to use this modeling approach is that the methodology does not require the diagnosis codes to be independent, as regression models do; in fact, the modeling
