observed frequencies is equal to $n$. It can be shown (Cramér 1947) that for this
reason one degree of freedom must be subtracted from the number of classes $m$.
More degrees of freedom are lost if, in order to obtain the theoretical frequencies
($f_e$), use is made of parameters estimated from the observations; in general, the
number of degrees of freedom is to be reduced further by the number of parameters
that were estimated. Consequently, the chi-square test of normality has ($m - 3$)
degrees of freedom. For the example of Table 2.3 with $\chi^2 = 9.1$, the number of
degrees of freedom is 4. From statistical tables, it can be found for $\alpha = 0.05$ that
$\chi^2_{0.95}(4) = 9.49$. Since 9.1 is less than 9.49, the normality hypothesis can be accepted. However, it
should be kept in mind that $\chi^2_{0.941}(4) = 9.1$. This means that a normal distribution
would yield a $\chi^2$ value equal to or larger than 9.1 in only 5.9 % of events if this particular
experiment were to be repeated a large number of times for the same theoretical
distribution.
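The tabulated values quoted above can be checked numerically. The following Python sketch is only an illustration, not the procedure used for Table 2.3. It relies on the closed-form upper-tail probability of the chi-square distribution, which holds for an even number of degrees of freedom, and the function names are hypothetical.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-square variable with an even number of degrees of freedom."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum (x/2)^k / k! for k = 0 .. df/2 - 1
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

def chi2_critical_even_df(alpha, df, lo=0.0, hi=200.0):
    """Bisection for the value exceeded with probability alpha."""
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_upper_tail_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

df = 4            # m - 3, as stated in the text
chi2_obs = 9.1    # test statistic quoted for Table 2.3

print(round(chi2_critical_even_df(0.05, df), 2))        # critical value for alpha = 0.05
print(round(chi2_upper_tail_even_df(chi2_obs, df), 3))  # probability of 9.1 or more
```

The first value should agree with the tabulated 9.49 and the second with the 5.9 % mentioned in the text.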
The preceding chi-square test for goodness of fit is well known. It was originally
proposed by Karl Pearson and refined by Ronald Fisher, who determined exactly the
number of degrees of freedom to be used. A similar test that is at least as good as
the chi-square test is the $G^2$-test (see, e.g., Bishop et al. 1975). Finally, the
Kolmogorov-Smirnov test should be mentioned. It consists of determining the
largest (positive or negative) difference between theoretical and observed cumulative relative frequencies. In the two-tailed Kolmogorov-Smirnov test, the absolute value of the largest
difference should not exceed $1.36/n^{0.5}$ with a probability of 95 %, provided that the
number of observations exceeds 40. The corresponding limit for the one-tailed test is $1.22/n^{0.5}$.
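A minimal Python sketch of the Kolmogorov-Smirnov comparison is given here for illustration. It assumes the standard form of the test, in which the differences are taken between the empirical and theoretical cumulative distribution functions. The function names are hypothetical, and the three-point sample is a toy example, not data from Table 2.3.

```python
import math

def ks_statistic(sample, cdf):
    """Largest absolute difference between the empirical and theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

def ks_limit(n, two_tailed=True):
    """Approximate 95 % limit quoted in the text (intended for n > 40)."""
    coefficient = 1.36 if two_tailed else 1.22
    return coefficient / math.sqrt(n)

# Toy sample tested against a uniform distribution on [0, 1].
print(ks_statistic([0.25, 0.5, 0.75], lambda x: x))

# Limits for n = 76, the number of biotite ages mentioned in the text.
print(round(ks_limit(76), 3), round(ks_limit(76, two_tailed=False), 3))
```

For $n = 76$ the two-tailed limit is about 0.156, so the normality hypothesis would be retained if the largest difference stays below that value.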
2.4.2 Q-Q Plots: Normal Distribution Example
Normality can also be tested graphically by means of a so-called Q-Q plot for
comparing observed quantiles with theoretical quantiles. When the theoretical
frequency distribution is normal, this is the same as using normal probability
paper. In Fig. 2.6, the scale along the vertical axis is linear but the horizontal
scale has been changed in such a manner that the S-shaped curve for any theoretical
cumulative normal distribution plots as a straight line. A normal distribution always
becomes a straight line on normal probability paper. Figure 2.6 shows three types of
plot for the 76 biotite ages listed in Table 2.3: (1) original data (points); (2) theoretical normal curve (straight line); (3) a 95 % confidence belt on the theoretical
normal curve. These three plots have been constructed as follows:
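The rescaling that underlies normal probability paper can be sketched in Python. This is an illustration only: the mean and standard deviation are hypothetical and are not estimates from Table 2.3, and `probability_scale` is a name introduced here. Each cumulative percentage is mapped to the corresponding standard normal quantile, so that a normal distribution becomes a straight line.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def probability_scale(cum_percent):
    """Position of a cumulative percentage on the probability axis.

    Returns None for 0 % and 100 %, which lie at minus and plus infinity
    and therefore cannot be plotted.
    """
    if not 0.0 < cum_percent < 100.0:
        return None
    return STANDARD_NORMAL.inv_cdf(cum_percent / 100.0)

# Hypothetical normal distribution; mu and sigma are illustrative values only.
mu, sigma = 1000.0, 100.0
model = NormalDist(mu, sigma)

for x in (800.0, 900.0, 1000.0, 1100.0, 1200.0):
    z = probability_scale(100.0 * model.cdf(x))
    # For a normal distribution z equals (x - mu) / sigma, which is linear in x.
    print(f"x = {x:6.0f}  z = {z:+.3f}")

print(probability_scale(100.0))  # None: 100 % is off the probability scale
```

Equal steps in x give equal steps in z, which is the straight-line property described above.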
Firstly, cumulative frequencies were determined for the classes of ages shown
in Table 2.2. These were converted into cumulative frequency percentage values.
If upper class limits are used, it is not possible to plot the value for the 1,200-1,220 Ma class because the last class has a cumulative frequency of 100 %, which is not
part of the probability scale. One may omit plotting this last value but a slight