Environmental Engineering Reference
made are between the upper X% of one group with the upper Y% of another group.
Further statistical tests would have little or no meaning. Another common mistake is
to substitute the ''less than'' values with zero. This produces estimates of the mean
and median that are biased low.
Many approaches for dealing with censored data are available, but practitioners
do not readily use them. Some of these approaches require in-depth statistical
theory, and interested readers should refer to the suggested readings (Cohen, 1991;
Ginevan and Splitstone, 2004; Helsel, 2004; Manly, 2001). Fortunately, the U.S.
EPA, in a manual on practical methods of data analysis, recommended a very simple
approach depending on the degree of data censoring (EPA, 1998):
With less than 15% of values censored, replace these values with DL
(detection limit), DL/2, or a very small value.
With 15-50% of values censored, use the maximum likelihood estimate (MLE),
or estimate the mean after excluding as many of the largest values as there
are censored small values (a trimmed mean).
With more than 50% of the values censored, just base the analysis on the
proportion of data values above a certain level.
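As a minimal sketch of the rule of thumb listed above, the Python function below picks a method from the fraction of censored values. The function name summarize_censored, the use of None to mark a "less than DL" result, and the threshold argument are assumptions made for illustration and do not come from the EPA manual. For the 15-50% range the sketch uses the trimmed mean rather than the MLE, and it assumes the threshold is at or above the detection limit.

```python
from statistics import mean


def summarize_censored(values, dl, threshold):
    """Choose a summary by censoring fraction (hypothetical helper).

    values    : list of concentrations, with None marking a result below DL
    dl        : detection limit
    threshold : level used for the proportion-based summary (assumed >= dl)
    """
    detected = sorted(v for v in values if v is not None)
    n_censored = len(values) - len(detected)
    fraction = n_censored / len(values)

    if fraction < 0.15:
        # Light censoring: substitute DL/2 for each censored value.
        filled = detected + [dl / 2] * n_censored
        return "substitution (DL/2)", mean(filled)

    if fraction <= 0.50:
        # Moderate censoring: drop as many of the largest detected values
        # as there are censored values, then average what remains.
        trimmed = detected[: len(detected) - n_censored]
        if trimmed:  # empty when exactly half of the values are censored
            return "trimmed mean", mean(trimmed)

    # Heavy censoring: report the proportion of all values above the threshold.
    above = sum(v > threshold for v in detected)
    return "proportion above threshold", above / len(values)


if __name__ == "__main__":
    sample = [None, None, None, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    print(summarize_censored(sample, dl=1.0, threshold=3.0))
```

Returning the method name together with the estimate keeps it visible which rule was applied, which matters because the three summaries are not directly comparable.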
The simple substitution method, although widely used, has no theoretical basis.
The MLE method assumes a distribution for the data and maximizes the likelihood
function (which depends on both the observed and the censored values) to
estimate the population parameters. A computer program called UNCENSOR can be
downloaded to apply these methods to censored data (Newman et al., 1995).
Such data analysis tools are seldom found in standard statistical packages.
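To make the MLE idea concrete, the following Python sketch writes out a likelihood that depends on both the detected and the censored values and maximizes it with a crude pattern search. It is not the UNCENSOR program. The normal distribution, the single detection limit, the function names, and the search routine are all assumptions chosen for illustration. If a lognormal model were preferred, the same functions could be applied to log-transformed values and log(DL).

```python
import math
from statistics import NormalDist, mean, stdev


def censored_loglik(mu, sigma, detected, n_censored, dl):
    """Log-likelihood of a normal model with left-censoring at dl."""
    dist = NormalDist(mu, sigma)
    # Detected values contribute the log density at the observed value.
    ll = sum(math.log(max(dist.pdf(x), 1e-300)) for x in detected)
    # Each censored value contributes log P(X < dl).
    ll += n_censored * math.log(max(dist.cdf(dl), 1e-300))
    return ll


def fit_censored_normal(detected, n_censored, dl, rounds=200):
    """Maximize the censored log-likelihood by a simple pattern search."""
    # Start from the mean and standard deviation of the detected values.
    mu, sigma = mean(detected), stdev(detected)
    best = censored_loglik(mu, sigma, detected, n_censored, dl)
    step = sigma
    for _ in range(rounds):
        improved = False
        for d_mu, d_sigma in ((step, 0), (-step, 0), (0, step), (0, -step)):
            m, s = mu + d_mu, sigma + d_sigma
            if s <= 0:
                continue  # sigma must stay positive
            ll = censored_loglik(m, s, detected, n_censored, dl)
            if ll > best:
                mu, sigma, best, improved = m, s, ll, True
        if not improved:
            step /= 2  # refine the search once no neighbor is better
    return mu, sigma


if __name__ == "__main__":
    detected = [1.2, 1.5, 1.9, 2.4, 3.1, 3.3, 4.0]
    mu_hat, sigma_hat = fit_censored_normal(detected, n_censored=3, dl=1.0)
    print(f"detected-only mean: {mean(detected):.3f}")
    print(f"censored MLE mean:  {mu_hat:.3f}, sigma: {sigma_hat:.3f}")
```

Because the censored values are known only to lie below the detection limit, the fitted mean is pulled below the mean of the detected values alone. In practice a tested numerical optimizer would replace the pattern search.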
2.2.6 Analysis of Spatial and Time Series Data
Often, environmental samples are taken at a specific location over a period of time,
or a snapshot of the concentration pattern is taken at sampling locations in a
two-dimensional space around a source. Data acquired through these sampling
plans are either concentrations vs. time (temporal data) or concentrations vs. x and y
(spatial data). The researcher may just want to know the average temporal
concentration (daily, weekly, monthly, or yearly average), so that this temporal
average can be compared to background concentrations or to regulatory standard
concentrations. In most cases, the researcher may further want to know whether the
concentrations present any temporal or spatial patterns.
One should be cautious: averaging all temporal or spatial data to obtain the
mean concentration and standard deviation is not a statistically sound approach if
these data show strong temporal or spatial patterns. The standard approach of
calculating the mean and standard deviation assumes that the data are random.
This assumption is clearly violated for data with temporal or spatial patterns.
In either case, such patterns should be identified and
particular statistical tools suited to such data should be used.
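One simple way to screen a time series for the kind of temporal pattern described above is the lag-1 autocorrelation, sketched below in Python. This is only one possible check and is not prescribed by the text. The function names and the rough cutoff of 2/sqrt(n), a common rule of thumb for random data, are assumptions made for illustration.

```python
import math
from statistics import mean


def lag1_autocorrelation(series):
    """Correlation between each value and the next one in time order."""
    m = mean(series)
    numerator = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    denominator = sum((x - m) ** 2 for x in series)
    return numerator / denominator


def looks_patterned(series):
    """Flag a series whose lag-1 autocorrelation exceeds about 2/sqrt(n)."""
    r1 = lag1_autocorrelation(series)
    return abs(r1) > 2 / math.sqrt(len(series))


if __name__ == "__main__":
    # Hypothetical concentrations that rise steadily over ten sampling dates.
    trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
    print(lag1_autocorrelation(trending), looks_patterned(trending))
```

A value near zero is consistent with random data, for which a plain mean and standard deviation are reasonable summaries. A large positive or negative value suggests that the pattern should be examined before the data are averaged.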