cessing before they are ready to use. It is a necessary component of research using administrative data;
yet few studies discuss the process in detail. For this reason, the preprocessing methodology cannot be
validated. Since the conclusions in the study depend upon the validity of the preprocessing, without some
knowledge of the preprocessing performed, the conclusions must always remain in doubt. We consider
a paper by Loughlin et al. in detail (Loughlin et al., 2002). In this study, patients with an ICD9 diagnosis
code in the range 140 through 208 were extracted for a 210-day period, and the cohort was then restricted
to patients with a prescription for transdermal fentanyl in that period. However, there were
no details provided as to how this extraction was performed. Moreover, the study goes on to define a
propensity score by first identifying factors that were statistically significant with respect to transdermal
fentanyl use. While stating that patient diagnoses were significant, the paper did not discuss how the
many diagnoses present in the patient dataset were reduced to three. Once the sub-population of
patients was identified, the extracted factors were used to examine costs. However, without knowing
if the sub-population was correctly identified, it is not possible to conclude that the results are correct.
Similar problems exist in other studies (Dobie et al., 2008; Epstein, Knight, Epstein, Bride, & Nichol,
2007; Hoy et al., 2007).
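To make concrete the kind of detail that is missing, the extraction described above could be documented with a short sketch such as the following. It is not the procedure used by Loughlin et al., which the paper does not report. The record layout (tuples of patient identifier, code or drug name, and date), the function names, and the matching of drug names by substring are all assumptions made for illustration.

```python
from datetime import date, timedelta
from typing import Optional


def icd9_category(code: str) -> Optional[int]:
    """Return the three-digit ICD9 category, or None for V and E codes.

    Assumes codes are stored as strings with leading zeros retained,
    with or without the decimal point (for example "174.9" or "1749").
    """
    head = code.strip().split(".")[0][:3]
    return int(head) if head.isdigit() else None


def extract_cohort(diagnoses, prescriptions, start, days=210):
    """Patients with an ICD9 category in 140-208 AND a transdermal
    fentanyl prescription, both dated inside the study window.

    diagnoses:     iterable of (patient_id, icd9_code, service_date)
    prescriptions: iterable of (patient_id, drug_name, fill_date)
    Both layouts are hypothetical.
    """
    end = start + timedelta(days=days)

    cancer = set()
    for patient_id, code, service_date in diagnoses:
        category = icd9_category(code)
        if category is not None and 140 <= category <= 208 and start <= service_date < end:
            cancer.add(patient_id)

    fentanyl = set()
    for patient_id, drug_name, fill_date in prescriptions:
        name = drug_name.lower()
        if "fentanyl" in name and "transdermal" in name and start <= fill_date < end:
            fentanyl.add(patient_id)

    return sorted(cancer & fentanyl)
```

Even a sketch this small exposes decisions that affect the resulting sub-population: whether the window end is inclusive, whether any diagnosis position counts or only the primary one, and how drug names are matched.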
A study by Cartsos, Zhu, and Zavras used ICD9 codes to define both the study population and the
treatment outcomes (Cartsos, Zhu, & Zavras, 2008). It stratified the sample by using the outcomes,
identifying a control population as not having these outcomes. Stratifying outcomes virtually guarantees
that the model results will be statistically significant. However, the biggest problem with this analysis is
that the outcome is a rare occurrence, and there is no indication in the paper that the statistical model was
modified to account for the rare occurrence. Unfortunately, omitting the modifications needed to examine rare
occurrences is a fairly common practice in retrospective studies (Gross, Galusha, & Krumholz, 2007;
West, Behrens, McDonnell, Tielsch, & Schem, 2005). As a result, the model will inflate the results, which
will all be statistically significant while having no real practical importance. A study concerning
the relationship between patient outcome and hospital volume did not clearly define how volume was
computed from the data (Xirasagar, Lien, Lee, Liu, & Tsai, 2008). Moreover, it is quite possible that
hospitals with higher volume also have higher-risk patients, suggesting that results must be adjusted
for patient severity (Gilligan et al., 2007).
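As an illustration of what a modification for a rare, outcome-stratified sample can look like, the sketch below applies prior correction to the intercept of a logistic model. This is one commonly used adjustment for case-control style sampling. The text does not say which modification is intended, so the technique, the function names, and the example rates are assumptions.

```python
import math


def prior_corrected_intercept(intercept, sample_rate, population_rate):
    """Adjust a logistic intercept fitted on an outcome-stratified sample.

    sample_rate:     proportion of outcomes in the analysis sample
    population_rate: proportion of outcomes in the source population
    """
    odds_ratio = ((1 - population_rate) / population_rate) * (
        sample_rate / (1 - sample_rate)
    )
    return intercept - math.log(odds_ratio)


def predicted_probability(intercept, linear_term=0.0):
    """Logistic probability for a given intercept and covariate contribution."""
    return 1.0 / (1.0 + math.exp(-(intercept + linear_term)))


# Hypothetical numbers: a sample balanced 50/50 on the outcome, drawn from a
# population in which the outcome occurs in 1% of patients.
naive = predicted_probability(0.0)
corrected = predicted_probability(prior_corrected_intercept(0.0, 0.5, 0.01))
```

With these hypothetical rates, the uncorrected model assigns a baseline probability of 0.5, while the corrected intercept returns the population rate of about 0.01, which shows how far stratifying on the outcome can inflate the apparent risk.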
Wynn, Chang, and Peipins provided more details about the CPT and ICD9 codes used to extract the
patient population (Wynn, Chang, & Peipins, 2007). The study matched a cohort of women with ovarian cancer
to a control cohort matched on age, geographic region, Medicare eligibility, and health plan type. It did
not match on co-morbidities; instead, it examined the difference between the two cohorts only on the
co-morbidities related to women's health. It did examine some other co-morbid conditions such as diabetes,
but excluded others such as previous myocardial infarction. Moreover, it used a 3:1 ratio of treatment to
control group without explaining why 3:1 is optimal as opposed to, say, a 1:1 or a 10:1 match.
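The matching step itself can be written down compactly, which makes the choice of ratio an explicit parameter that can be varied and justified. The following is a minimal sketch of exact matching without replacement on a set of covariates. It is not the procedure used by Wynn, Chang, and Peipins; the dictionary record layout, the field names, and the function name are hypothetical.

```python
from collections import defaultdict


def match_controls(cases, controls, keys, k):
    """Assign up to k controls to each case, matching exactly on `keys`.

    cases, controls: lists of dicts that each carry an "id" field plus the
    matching covariates (hypothetical layout). Controls are used at most once.
    """
    pool = defaultdict(list)
    for control in controls:
        stratum = tuple(control[key] for key in keys)
        pool[stratum].append(control["id"])

    matches = {}
    for case in cases:
        bucket = pool[tuple(case[key] for key in keys)]
        take = min(k, len(bucket))
        matches[case["id"]] = [bucket.pop(0) for _ in range(take)]
    return matches
```

Running the same function with different values of `k` and comparing how many cases receive a full set of matches is one simple way to show why a particular ratio was chosen.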
Claims data can also be used to examine health disparities and access to services, especially when
matched to census data (Halliday, Taira, Davis, & Chan, 2007). Such studies are particularly relevant to
the study of cancer screening and compliance with that screening (Mariotto, Etzioni, Krapcho, & Feuer,
2007). These studies can become time-oriented, looking at the time to screening or the time between
screening and treatment. In this case, it is possible to introduce the methodology of survival data mining
with multiple time events. Multiple time events also arise when examining the data for the
occurrence of adverse events after surgery (Baxter et al., 2007).
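Before any survival method can be applied, each patient's claims have to be turned into durations with a censoring indicator. The sketch below derives the two intervals mentioned above, time to screening and time from screening to treatment. The function name, the argument names, and the use of a single study end date for censoring are assumptions made for illustration.

```python
from datetime import date


def time_events(eligible, screened, treated, study_end):
    """Return two (days, observed) pairs for one patient.

    The first pair is time from eligibility to screening; the second is time
    from screening to treatment, or None if the patient was never screened.
    observed is False when the interval is censored at study_end.
    """
    first_end = screened if screened is not None else study_end
    to_screening = ((first_end - eligible).days, screened is not None)
    if screened is None:
        return to_screening, None

    second_end = treated if treated is not None else study_end
    to_treatment = ((second_end - screened).days, treated is not None)
    return to_screening, to_treatment
```

A patient who is never screened contributes only a censored first interval, which is exactly the information a survival model needs and a simple average of observed times would discard.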
Another part of the decision is to determine both the inputs and outcomes under study. Many studies
focus on adherence to guidelines without going to the next step of investigating outcomes with respect