Environmental Engineering Reference
EXAMPLE 10.8
$$\chi^2 = \sum_{m=1}^{M} \frac{(X_m - N p_m)^2}{N p_m}$$
It is proposed that the water-quality samples given in
Example 10.7 are drawn from a log-normal distribution
with a natural-log mean of 1.20 and a standard deviation
of 0.80. Using the following categories of data, evaluate
whether the hypothesis of log-normality is supported at
the 5% significance level.
$$\chi^2 = \frac{[6 - 30(0.160)]^2}{30(0.160)} + \frac{[6 - 30(0.201)]^2}{30(0.201)} + \frac{[8 - 30(0.165)]^2}{30(0.165)} + \frac{[5 - 30(0.170)]^2}{30(0.170)} + \frac{[5 - 30(0.304)]^2}{30(0.304)} = 4.042$$
Category   Range (mg/l)
1          [0, 1.5]
2          (1.5, 2.5]
3          (2.5, 3.5]
4          (3.5, 5.0]
5          (5.0, ∞)

The exceedance probability corresponding to a chi-square
value of 4.042 with 4 degrees of freedom is 0.4004. Since
this probability is much greater than 0.05 (i.e., 5%), the
hypothesis that the data are drawn from a log-normal
distribution is accepted.
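The arithmetic of the chi-square statistic in this example can be checked with a short script. This is a minimal sketch, not part of the original text: the counts and probabilities are copied from the example's table, and the exceedance probability uses the closed form that holds only for 4 degrees of freedom, P(χ² > x) = exp(−x/2)(1 + x/2). The variable names are illustrative.

```python
import math

# Observed counts X_m and category probabilities p_m, copied from the example's table.
observed = [6, 6, 8, 5, 5]
probs = [0.160, 0.201, 0.165, 0.170, 0.304]
n = sum(observed)  # N = 30

# Chi-square statistic: sum over categories of (X_m - N p_m)^2 / (N p_m).
chi2 = sum((x - n * p) ** 2 / (n * p) for x, p in zip(observed, probs))

# Degrees of freedom: categories minus one (no parameters estimated from the data).
dof = len(observed) - 1
assert dof == 4  # the closed form below is valid only for 4 degrees of freedom

# Exceedance probability P(chi^2 > chi2) for 4 degrees of freedom.
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)

print(f"chi2 = {chi2:.3f}, exceedance probability = {p_value:.4f}")
print("accept" if p_value > 0.05 else "reject")
```

The results should agree with the values quoted in the example to within rounding of the last digit. For other degrees of freedom, a chi-square table or a statistics library would be needed in place of the closed form.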
10.5.2.2  Kolmogorov-Smirnov Test.  This test differs
from the chi-square test in that none of the parameters
from the theoretical probability distribution need to be
estimated from the observed data. In this sense, the KS
test is classified as a nonparametric test. The procedure
for implementing the KS test is as follows (Haan, 1977):
Solution
From the given data: μ_y = 1.2, σ_y = 0.8, and N = 30. The
analysis of the given data is summarized in the following
table:
Step 1. Let P_X(x) be the specified theoretical CDF
under the null hypothesis.
Step 2. Let S_N(x) be the sample CDF based on N
observations. For any observed x, S_N(x) = k/N,
where k is the number of observations less than or
equal to x.
Step 3. Determine the maximum deviation, D,
defined by
m   c (mg/l)     X(m)   z                   F(z)             p_m
1   [0, 1.5]     6      (−∞, −0.993]        [0, 0.160]       0.160
2   (1.5, 2.5]   6      [−0.993, −0.355]    [0.160, 0.361]   0.201
3   (2.5, 3.5]   8      [−0.355, 0.066]     [0.361, 0.526]   0.165
4   (3.5, 5.0]   5      [0.066, 0.512]      [0.526, 0.696]   0.170
5   (5.0, ∞)     5      [0.512, ∞)          [0.696, 1.000]   0.304
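The z, F(z), and p_m columns of the table can be reproduced from the category boundaries. The sketch below is an illustration rather than the author's procedure: it assumes the log-normal parameters stated in the example (μ_y = 1.2, σ_y = 0.8) and uses Python's standard-library `statistics.NormalDist` for the standard normal CDF. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_y, sigma_y = 1.2, 0.8        # natural-log mean and standard deviation
edges = [1.5, 2.5, 3.5, 5.0]    # interior category boundaries (mg/l)

std_normal = NormalDist()       # standard normal: mean 0, standard deviation 1

# Standard normal deviate at each boundary: z = (ln c - mu_y) / sigma_y.
z = [(math.log(c) - mu_y) / sigma_y for c in edges]

# F(z) at each boundary; the outer limits (c = 0 and c -> infinity) give 0 and 1.
cdf = [0.0] + [std_normal.cdf(v) for v in z] + [1.0]

# Category probability: p_m = F(z_U) - F(z_L).
p = [upper - lower for lower, upper in zip(cdf, cdf[1:])]

for m, p_m in enumerate(p, start=1):
    print(m, round(p_m, 3))
```

Because the table rounds F(z) to three decimals before differencing, the computed probabilities may differ from the tabulated ones in the last digit.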
$$D = \max_{x} \left| P_X(x) - S_N(x) \right| \qquad (10.55)$$
where m = category, c = measured concentrations,
X(m) = number of observations in category m, z =
standard normal deviate corresponding to c, F(z) =
CDF of z, and p_m = probability of c occurring in category
m. The values of z and p_m are calculated using the
relations
Step 4. If, for the chosen significance level, the
observed value of D is greater than or equal to the
critical value of the KS statistic tabulated in
Appendix C.5, the hypothesis is rejected.
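Steps 1 to 4 can be sketched in a few lines of Python. Several things here are assumptions and not from the source: the sample values are made up for illustration, the function name `ks_statistic` is hypothetical, and the critical value uses the common large-sample approximation 1.36/√N for the 5% level as a stand-in for the tabulated value in Appendix C.5 (the approximation is intended for larger N than the toy sample used here). The sketch also checks the sample CDF on both sides of each jump, a common refinement of the definition S_N(x) = k/N in Step 2.

```python
import math
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Return D = max |P_X(x) - S_N(x)| evaluated at the observations."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for k, x in enumerate(xs, start=1):
        p = cdf(x)
        # S_N jumps from (k-1)/N to k/N at x, so check both sides of the step.
        d = max(d, abs(p - k / n), abs(p - (k - 1) / n))
    return d


def log_normal_cdf(x, mu_y=1.2, sigma_y=0.8):
    """Theoretical CDF under the null hypothesis (log-normal)."""
    return NormalDist(mu_y, sigma_y).cdf(math.log(x))


# Hypothetical concentrations (mg/l), invented for illustration only.
sample = [0.9, 1.4, 2.1, 2.8, 3.0, 3.3, 4.1, 5.6, 7.2, 9.5]

d = ks_statistic(sample, log_normal_cdf)

# Large-sample approximation to the 5% critical value (stand-in for the table).
d_crit = 1.36 / math.sqrt(len(sample))

print(f"D = {d:.3f}, approximate 5% critical value = {d_crit:.3f}")
print("reject" if d >= d_crit else "do not reject")
```

For an actual test, the tabulated critical value for the given N and significance level should replace the approximation.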
An advantage of the KS test over the chi-square test
is that it is not necessary to divide the data into intervals;
thus, any error of judgment associated with the number
or size of the intervals is avoided (Haldar and Mahadevan, 2000).
A minimum sample size of 50 is usually recommended
when using this test (McBean and Rovers,
1998). The KS test is generally more efficient than the
chi-square test when the sample size is small (McCuen,
2002a). However, neither of these tests is very powerful
in the sense that the probability of accepting a false
hypothesis is quite high, especially for small samples
(Haan, 1977). Furthermore, probability distributions
that demonstrate a good fit with observed data might
$$z = \frac{\ln c - \mu_y}{\sigma_y} = \frac{\ln c - 1.2}{0.8}$$

$$p_m = F(z_U) - F(z_L)$$
where z_U and z_L are the upper and lower values
of z that bound the given category. Since there are
five categories and the parameters of the distribution
were not estimated from the data, the number
of degrees of freedom is 5 − 1 = 4. The chi-square
statistic given by Equation (10.54) is calculated as
follows: