Table 1. Summary statistics for pneumonia

pneumonia  N Obs    Variable  Mean       Std Dev    Minimum     Maximum      N
0          7566548  LOS       4.4941732  6.7191416  0           365.0000000  7566151
                    TOTCHG    21779.98   38084.94   25.0000000  999926.00    7443500
1          425459   LOS       7.1067977  8.6216278  0           356.0000000  425421
                    TOTCHG    32528.21   53918.78   35.0000000  998514.00    420161

Table 2. Quartile values for pneumonia

pneumonia  N Obs    Variable  N        Lower Quartile  Median     Upper Quartile
0          7569333  LOS       7568934  2.0000000       3.0000000  5.0000000
                    TOTCHG    7446228  5583.00         11406.00   23807.00
1          425715   LOS       425677   3.0000000       5.0000000  8.0000000
                    TOTCHG    420417   8689.00         16437.00   33982.00

Example Limited to Three Diagnoses
We consider an example where the number of diagnoses is limited to 3 and where the weights, α, are
equal to one. We will examine the ability of the model to predict patient outcomes.
With very large samples, nearly every effect is statistically significant, even when it is of little practical
importance. For example, we want to consider whether pneumonia increases the length of stay and the total
charges for patients. We also want to know whether the linear model reproduces the outcome that is represented
in the kernel density graphs discussed in the previous chapter. Therefore, we look at Table 1, which gives the
summary statistics for length of stay (LOS) and total charges (TOTCHG) for patients with and without pneumonia.
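To make the large-sample point concrete, the following minimal sketch computes a Welch t statistic and a standardized effect size (Cohen's d) for length of stay directly from the means, standard deviations, and counts in Table 1. The function names are our own, and we assume the reported standard deviations are sample standard deviations; the text does not say how the original significance tests were run.

```python
import math

# Table 1 values for length of stay (LOS): (mean, std dev, N)
LOS_NO_PNEUMONIA = (4.4941732, 6.7191416, 7566151)
LOS_PNEUMONIA = (7.1067977, 8.6216278, 425421)


def welch_t(group_a, group_b):
    """Welch t statistic for the difference in means (group_b minus group_a)."""
    mean_a, sd_a, n_a = group_a
    mean_b, sd_b, n_b = group_b
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_b - mean_a) / standard_error


def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    mean_a, sd_a, n_a = group_a
    mean_b, sd_b, n_b = group_b
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var)


t_stat = welch_t(LOS_NO_PNEUMONIA, LOS_PNEUMONIA)
effect = cohens_d(LOS_NO_PNEUMONIA, LOS_PNEUMONIA)
print(f"Welch t statistic: {t_stat:.1f}")
print(f"Cohen's d:         {effect:.2f}")
```

With millions of records the standard error is tiny, so the t statistic is far beyond any conventional significance threshold, while the standardized effect size stays modest. The p-value alone therefore says little about practical importance.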
Note that pneumonia appears to add about two and a half days to a hospital stay, at a cost of over
$10,000 more than for patients without pneumonia. However, in a linear regression for length of stay,
the coefficient of determination, r², is equal to 0.007, even though pneumonia is statistically significant. For
total charges, r² is equal to 0.003. This means that 0.7% of the variability in length of stay is explained by
the pneumonia diagnosis, and 0.3% of the variability in total charges. Therefore, while pneumonia
is significant, it explains very little of the variability in either outcome. One reason the linear
model shows such a large difference in means is the presence of outliers. These outliers are considerable,
with a maximum stay of 365 days and maximum charges of almost one million dollars. If we look at the median and
quartile values, the differences are not so great (Table 2).
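The small r² can be checked against Table 1. When the only predictor is a 0/1 indicator, the fitted values are the two group means, so r² is the between-group sum of squares divided by the total sum of squares. The sketch below applies this to length of stay. It assumes the tabulated standard deviations use the n - 1 divisor and that the regression used the same records as the table, neither of which is stated in the text; the function name is our own.

```python
# Table 1 values for length of stay (LOS): (mean, std dev, N)
LOS_NO_PNEUMONIA = (4.4941732, 6.7191416, 7566151)
LOS_PNEUMONIA = (7.1067977, 8.6216278, 425421)


def r_squared_binary(group_a, group_b):
    """r^2 of a regression on a single 0/1 indicator, from group summaries."""
    mean_a, sd_a, n_a = group_a
    mean_b, sd_b, n_b = group_b
    grand_mean = (n_a * mean_a + n_b * mean_b) / (n_a + n_b)
    # Variation explained by the indicator: group means around the grand mean.
    ss_between = n_a * (mean_a - grand_mean) ** 2 + n_b * (mean_b - grand_mean) ** 2
    # Unexplained variation: spread of patients around their own group mean.
    ss_within = (n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2
    return ss_between / (ss_between + ss_within)


r2_los = r_squared_binary(LOS_NO_PNEUMONIA, LOS_PNEUMONIA)
print(f"r^2 for LOS on the pneumonia indicator: {r2_los:.4f}")
```

Under these assumptions the result rounds to the 0.007 reported for length of stay. The within-group spread dominates the difference between the two means, which is why the indicator explains so little.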
Note that the difference in length of stay is two days at the median, with a difference of about $5000 in charges.
The difference in length of stay is one day at the lower quartile and three days at the upper quartile.
Therefore, the gap between patients with and without pneumonia widens from the lower quartile to the upper quartile.
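The quartile comparison can be reproduced from Table 2 with a few subtractions. The sketch below is only a convenience for reading the table; the dictionary layout and function name are our own choices.

```python
# Table 2 quartiles keyed by the pneumonia indicator:
# (lower quartile, median, upper quartile)
LOS_QUARTILES = {0: (2.0, 3.0, 5.0), 1: (3.0, 5.0, 8.0)}
TOTCHG_QUARTILES = {0: (5583.0, 11406.0, 23807.0), 1: (8689.0, 16437.0, 33982.0)}


def quartile_gaps(quartiles):
    """Pneumonia minus no-pneumonia at each of the three quartiles."""
    return tuple(
        with_pneumonia - without_pneumonia
        for without_pneumonia, with_pneumonia in zip(quartiles[0], quartiles[1])
    )


los_gaps = quartile_gaps(LOS_QUARTILES)
totchg_gaps = quartile_gaps(TOTCHG_QUARTILES)
print("LOS gaps (days):      ", los_gaps)     # (1.0, 2.0, 3.0)
print("TOTCHG gaps (dollars):", totchg_gaps)  # (3106.0, 5031.0, 10175.0)
```

Both outcomes show the same pattern: the gap grows toward the upper quartile, which is consistent with long, expensive stays pulling the group means apart more than the medians.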
We now consider adding the indicator function for Septicemia to the linear regression for both total
charges and length of stay (Table 3).
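As an illustration of what adding a second indicator to the regression means, the sketch below fits LOS = b0 + b1·pneumonia + b2·septicemia by ordinary least squares using the normal equations. The eight records are invented toy values chosen only to make the arithmetic easy to follow; they are not the hospital data, and the helper names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_two_indicators(records):
    """OLS fit of the outcome on an intercept and two 0/1 indicators."""
    design = [[1.0, float(p), float(s)] for p, s, _ in records]
    outcome = [float(y) for _, _, y in records]
    k = 3
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, outcome)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    mean_y = sum(outcome) / len(outcome)
    ss_total = sum((y - mean_y) ** 2 for y in outcome)
    ss_resid = sum(
        (y - sum(b * x for b, x in zip(beta, row))) ** 2
        for row, y in zip(design, outcome)
    )
    return beta, 1.0 - ss_resid / ss_total


# Toy records (pneumonia, septicemia, length of stay); invented for illustration.
TOY_RECORDS = [
    (0, 0, 3), (0, 0, 5),
    (1, 0, 5), (1, 0, 7),
    (0, 1, 11), (0, 1, 13),
    (1, 1, 13), (1, 1, 15),
]

beta, r2 = fit_two_indicators(TOY_RECORDS)
print("intercept, pneumonia, septicemia:", [round(b, 3) for b in beta])
print("r^2:", round(r2, 3))
```

In the toy data the coefficients recover the built-in effects (a baseline of 4 days, 2 more for pneumonia, 8 more for septicemia) and r² is high because the groups are tight. In the hospital data the groups overlap heavily, so r² stays small even when the coefficients are large.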
Unquestionably, the occurrence of septicemia brings the length of stay to 12 days with or without
pneumonia, and brings the total charges to around $70,000. For total charges, this increases r² to
0.038; for length of stay, r² is equal to 0.037. Again, very little of the variability in the outcome can be
explained by the presence or absence of these two diseases (Table 4).
Note that pneumonia without septicemia adds two days to the length of stay and about $4000 if we
 