to or close to the median. Consider reporting the median in the first
place.
The geometric mean is more appropriate than the arithmetic in three
sets of circumstances:
1. When losses or gains can best be expressed as a percentage rather
than a fixed value.
2. When rapid growth is involved, as in the development of a
bacterial or viral population.
3. When the data span several orders of magnitude, as with the
concentration of pollutants.
Because bacterial populations can double in number in only a few
hours, many government health regulations utilize the geometric rather
than the arithmetic mean. 7 A number of other government regulations
also use it, though the sample median would be far more appropriate. 8
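As a rough illustration of the third circumstance, the sketch below compares the two means on data spanning several orders of magnitude. The concentration values are hypothetical, chosen only for illustration; `statistics.geometric_mean` requires Python 3.8 or later.

```python
import statistics

# Hypothetical pollutant concentrations spanning several orders of magnitude.
concentrations = [10, 100, 1000, 100000]

arithmetic = statistics.mean(concentrations)
geometric = statistics.geometric_mean(concentrations)
median = statistics.median(concentrations)

# The arithmetic mean is dominated by the single largest value;
# the geometric mean averages on the log scale instead.
print(f"arithmetic mean: {arithmetic:.1f}")
print(f"geometric mean:  {geometric:.1f}")
print(f"median:          {median:.1f}")
```

With these values the geometric mean sits much nearer the median than the arithmetic mean does, which is why the median is worth considering as well.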
Whether you report a mean or a median, be sure to report only a
sensible number of decimal places. Most statistical packages can give
you 9 or 10. Don't use them. If your observations were to the nearest
integer, your report on the mean should include only a single decimal
place. For guides to the appropriate number of digits, see Ehrenberg
[1977]; for percentages, see van Belle [2002, Table 7.4].
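A minimal sketch of this advice, assuming hypothetical observations recorded to the nearest integer: compute the mean at full precision, but report it to a single decimal place.

```python
import statistics

# Hypothetical observations recorded to the nearest integer.
observations = [3, 4, 4, 5, 7, 8, 9]

mean = statistics.mean(observations)  # full-precision value from the package
reported = f"{mean:.1f}"              # one decimal place for integer data

print(reported)
```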
The standard error is a useful measure of population dispersion if the
observations come from a normal or Gaussian distribution. If the
observations are normally distributed, as in the bell-shaped curve
depicted in Figure 7.1, then in 95% of the samples we would expect the
sample mean to lie within two standard errors of the mean of our
original sample.
But if the observations come from a nonsymmetric distribution like an
exponential or a Poisson, or a truncated distribution like the uniform, or a
mixture of populations, we cannot draw any such inference.
Recall that the standard error equals the standard deviation divided by
the square root of the sample size,

$$SE = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n(n-1)}}.$$

Because the standard error depends on the squares of individual
observations, it is particularly sensitive to outliers. A few extra
large observations will have a dramatic impact on its value.
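The sensitivity to outliers can be seen in a small sketch. It computes the standard error as the square root of the sum of squared deviations from the sample mean divided by n(n - 1), first for a sample and then for the same sample with one value replaced by an outlier. The data and the helper name `standard_error` are hypothetical, introduced only for illustration.

```python
import math
import statistics


def standard_error(xs):
    """Standard error of the mean: sqrt(sum((x - mean)^2) / (n * (n - 1)))."""
    n = len(xs)
    mean = sum(xs) / n
    sum_sq = sum((x - mean) ** 2 for x in xs)
    return math.sqrt(sum_sq / (n * (n - 1)))


# Hypothetical sample, and the same sample with the last value replaced by 50.
sample = [9, 10, 10, 11, 10, 9, 11, 10]
with_outlier = sample[:-1] + [50]

se_clean = standard_error(sample)
se_outlier = standard_error(with_outlier)

print(f"SE without outlier: {se_clean:.3f}")
print(f"SE with outlier:    {se_outlier:.3f}")
# The median, by contrast, is unchanged by the single extreme value.
print(statistics.median(sample), statistics.median(with_outlier))
```

A single extreme observation inflates the standard error many times over while leaving the median untouched.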
If you can't be sure your observations come from a normal distribution,
then consider reporting your results either in the form of a histogram (as
in Figure 7.2) or in a box and whiskers plot (Figure 7.3). See also Lang
and Secic [1997, p. 50].
7 See, for example, 40 CFR part 131, 62 Fed. Reg. 23004 at 23008 (28 April 1997).
8 Examples include 62 Fed. Reg. 45966 at 45983 (concerning the length of a hospital stay)
and 62 Fed. Reg. 45116 at 45120 (concerning sulfur dioxide emissions).