Environmental Engineering Reference
6
Statistics
Statistics provides powerful tools for analyzing data, including tools for analyzing
sampling activities. Common uses include determining the number of samples needed,
calculating the standard deviation, performing regression analysis, identifying extraneous
values, and predicting component values between sampled points. In all cases calculators
or computers equipped with standard software, including spreadsheets, can be used to do
the statistical calculations. However, to understand what information a statistic
provides, and to be able to explain it to others, it is important to know how the
statistic is calculated. This knowledge is also important when trying to understand what
others are explaining with statistics.
It is common to hear people talk about median, mean, average, one standard deviation,
and so on. When such conversations occur, are the speakers using the terms correctly?
How does what they are saying affect your sampling plan? On the other hand, can the
statistics being used in the sampling plan be explained to others?
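The terms mentioned above are easy to mix up, so a small worked example may help. The sketch below uses Python's standard `statistics` module on a made-up set of five values (the data are hypothetical and chosen only so the arithmetic is easy to check by hand). It shows that the mean and median can differ for the same data, and that the standard deviation depends on whether the divisor is n − 1 (sample) or n (population).

```python
import statistics

# Hypothetical sample values (e.g., concentrations in mg/kg); illustrative only.
values = [2.0, 4.0, 4.0, 5.0, 10.0]

mean = statistics.mean(values)      # arithmetic average: sum of values / n
median = statistics.median(values)  # middle value of the sorted data

# Sample standard deviation: sum of squared deviations divided by n - 1.
sample_sd = statistics.stdev(values)
# Population standard deviation: sum of squared deviations divided by n.
population_sd = statistics.pstdev(values)

print(f"mean = {mean}, median = {median}")
print(f"sample sd = {sample_sd:.3f}, population sd = {population_sd:.3f}")
```

Here the single high value (10.0) pulls the mean above the median, which is one reason to ask which of the two a speaker means when reporting an "average."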
A great many symbols are used in statistics, and it is important to keep them in mind
while studying and interpreting statistical results. Table 6.1 gives some of the most
commonly used symbols and the terms they represent. These are important whether the
statistics are calculated by hand or by computer, because once the calculation is done
the experimenter must still interpret the statistical results.
Statistics can be calculated by hand; however, in most cases calculations will be
carried out using computers and more or less complicated statistics programs. Although
a handheld calculator can be used for simple calculations, in all cases a computer will be
preferred. In the case of geostatistics the calculations are never done by hand or with
handheld calculators because the equations are too complex. Keep in mind that some
statistics programs are difficult to learn and use, while others are not. In addition, some
statistics programs leave the experimenter in charge, while others take charge.
For example, some statistics programs require setting up the statistics or the sampling
plan before obtaining or entering the data. Other programs allow the input of the data and
subsequent application of various statistical calculations. Because it is not always clear
what information will be needed or which statistic might need to be applied, the latter
approach is usually preferable. In Figure 6.1, after the data are entered it is found that two
data points are missing (X in A2 and Y in A6). If all the statistics had been set up
beforehand [e.g., the number of samples and the degrees of freedom (n and n − 1)], the
whole setup would have to be changed because of the missing data points.
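The advantage of entering the data first can be sketched in a few lines of Python. In this minimal illustration the measurements are hypothetical (they are not the contents of Figure 6.1), and `None` is used as an assumed marker for a missing data point. Because n and n − 1 are computed from the data actually present, nothing has to be set up again when points turn out to be missing.

```python
import statistics

# Hypothetical measurements; None marks a missing data point.
# Values and positions are illustrative, not taken from Figure 6.1.
raw = [12.1, None, 11.8, 12.4, 12.0, None, 11.9]

# Keep only the observations that are actually present.
observed = [x for x in raw if x is not None]

n = len(observed)   # number of usable samples
dof = n - 1         # degrees of freedom for the sample standard deviation

mean = statistics.mean(observed)
sample_sd = statistics.stdev(observed)  # divides by n - 1 internally

print(f"planned samples: {len(raw)}, usable samples n: {n}, n - 1: {dof}")
print(f"mean = {mean:.2f}, sample sd = {sample_sd:.3f}")
```

Seven samples were planned but only five are usable, so the statistics are based on n = 5 and 4 degrees of freedom rather than the values that would have been fixed in advance.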
If the data are entered and the statistics subsequently calculated, two possible
adjustments could be made. First the data could be calculated
 