samples under size 30, the interval is still suspect. The idea behind these
intervals comes from the observation that percentile bootstrap intervals
are most accurate when the estimate is symmetrically distributed about
the true value of the parameter and the tails of the estimate's distribution
drop off rapidly to zero. The symmetric, bell-shaped normal distribution
depicted in Figure 7.1 represents this ideal.
Suppose θ is the parameter we are trying to estimate, θ̂ is the estimate,
and we are able to come up with a monotone increasing transformation
m such that m(θ) is normally distributed about m(θ̂). We could use this
normal distribution to obtain an unbiased confidence interval, and then
apply a back-transformation to obtain an almost-unbiased confidence
interval. 3
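
The transform-then-back-transform idea can be sketched in a few lines. This is only an illustration under stated assumptions, not the BCa procedure itself: it assumes the transformation m is known in advance (here the natural log, a hypothetical choice for a standard deviation), and it estimates the standard error on the transformed scale by ordinary resampling. The function and parameter names are invented for this sketch.

```python
import math
import random
import statistics

def transformed_normal_interval(data, stat, m, m_inv,
                                n_boot=1000, level=0.90, seed=0):
    """Normal-theory interval built on the scale of m, then mapped back.

    m must be monotone increasing and m_inv its inverse; both are
    assumed known here, which is rarely the case in practice.
    """
    rng = random.Random(seed)
    n = len(data)
    center = m(stat(data))  # m(theta-hat)
    # Standard error of m(theta-hat), estimated by resampling with replacement.
    boot = [m(stat(rng.choices(data, k=n))) for _ in range(n_boot)]
    se = statistics.stdev(boot)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    # Symmetric interval on the transformed scale, back-transformed.
    return m_inv(center - z * se), m_inv(center + z * se)

if __name__ == "__main__":
    rng = random.Random(42)
    sample = [rng.gauss(10.0, 2.0) for _ in range(40)]
    print(transformed_normal_interval(sample, statistics.stdev,
                                      math.log, math.exp))
```

Because m is monotone increasing, the back-transformed limits keep their order and always bracket the original estimate, although the interval is no longer symmetric about it.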
Even with these modifications, we do not recommend the use of the
nonparametric bootstrap with samples of fewer than 100 observations.
Simulation studies suggest that with small sample sizes, the coverage is far
from exact and the endpoints of the intervals vary widely from one set of
bootstrap samples to the next. For example, Tu and Zhang [1992] report
that with samples of size 50 taken from a normal distribution, the actual
coverage of an interval estimate rated at 90% using the BCa bootstrap is
88%. When the samples are taken from a mixture of two normal distributions
(a not uncommon situation with real-life data sets) the actual coverage
is 86%. With samples of only 20 in number, the actual coverage is 80%.
A more serious problem in applying the bootstrap is that the endpoints
of the resulting interval estimates may vary widely from one set of
bootstrap samples to the next. For example, when Tu and Zhang drew
samples of size 50 from a mixture of normal distributions, the average of
the left limit of 1000 bootstrap samples taken from each of 1000 simulated
data sets was 0.72 with a standard deviation of 0.16, and the average
and standard deviation of the right limit were 1.37 and 0.30, respectively.
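
A small simulation in the same spirit shows how such endpoint variability can be measured. This is a sketch, not a reproduction of Tu and Zhang's study: the mixture components, the choice of the sample variance as the statistic, the use of the simple percentile interval in place of BCa, and the much smaller replication counts are all assumptions made for illustration, so the numbers it prints will not match those quoted above.

```python
import random
import statistics

def draw_mixture(rng, n):
    """n draws from a hypothetical 50/50 mixture of N(0, 1) and N(2, 1)."""
    return [rng.gauss(0.0, 1.0) if rng.random() < 0.5 else rng.gauss(2.0, 1.0)
            for _ in range(n)]

def percentile_interval(rng, data, stat, n_boot=200, level=0.90):
    """Simple percentile bootstrap interval (not BCa)."""
    n = len(data)
    boot = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    tail = (1.0 - level) / 2
    return boot[int(tail * n_boot)], boot[int((1 - tail) * n_boot) - 1]

def endpoint_spread(n=50, n_datasets=100, seed=1):
    """Mean and SD of the left and right limits across simulated data sets."""
    rng = random.Random(seed)
    lefts, rights = [], []
    for _ in range(n_datasets):
        data = draw_mixture(rng, n)
        lo, hi = percentile_interval(rng, data, statistics.variance)
        lefts.append(lo)
        rights.append(hi)
    return (statistics.fmean(lefts), statistics.stdev(lefts),
            statistics.fmean(rights), statistics.stdev(rights))

if __name__ == "__main__":
    print(endpoint_spread())
```

A large standard deviation of either limit relative to the width of the interval is the symptom the text describes: two analysts with equally valid samples would report noticeably different intervals.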
Parametric Bootstrap
Even when we know the form of the population distribution, the use of
the parametric bootstrap to obtain interval estimates may prove advantageous
either because the parametric bootstrap provides more accurate
answers than textbook formulas or because no textbook formulas exist.
Suppose we know that the observations come from a normal distribution
and want an interval estimate for the standard deviation. We would
draw repeated bootstrap samples from a normal distribution, the mean of
which is the sample mean and the variance of which is the sample variance.
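
The procedure just described can be sketched as follows. The function name, the number of bootstrap samples, and the use of simple percentile cut points to read off the interval are assumptions of this sketch rather than anything prescribed in the text.

```python
import random
import statistics

def parametric_bootstrap_sd_interval(data, n_boot=2000, level=0.90, seed=0):
    """Interval estimate for the SD, resampling from a fitted normal."""
    rng = random.Random(seed)
    n = len(data)
    mean = statistics.fmean(data)   # mean of the fitted normal
    sd = statistics.stdev(data)     # square root of the sample variance
    boot_sds = []
    for _ in range(n_boot):
        # Draw a bootstrap sample of the same size from N(mean, sd^2).
        sample = [rng.gauss(mean, sd) for _ in range(n)]
        boot_sds.append(statistics.stdev(sample))
    boot_sds.sort()
    tail = (1.0 - level) / 2
    # Crude percentile cut points; finer interpolation is possible.
    lo = boot_sds[int(tail * n_boot)]
    hi = boot_sds[int((1 - tail) * n_boot) - 1]
    return lo, hi

if __name__ == "__main__":
    rng = random.Random(7)
    observations = [rng.gauss(100.0, 15.0) for _ in range(50)]
    print(parametric_bootstrap_sd_interval(observations))
```

Unlike the nonparametric bootstrap, the resampling here draws fresh values from the fitted normal instead of reusing the observed values, so the result depends on the normality assumption being reasonable.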
3 Stata™ provides for bias-corrected intervals via its bstrap command. R and S-Plus both
include BCa functions. A SAS macro is available at
http://www.asu.edu/it/fyi/research/helpdocs/statistics/SAS/tips/jackboot.html.