Environmental Engineering Reference
TABLE 10.2. Cumulative Distribution Function of Concentration Measurements

rank   F(c)    c (mg/l)    rank   F(c)    c (mg/l)    rank   F(c)    c (mg/l)
   1   0.980      15.49      11   0.649       3.29      21   0.318       1.98
   2   0.947      14.30      12   0.616       3.26      22   0.285       1.83
   3   0.914      14.30      13   0.583       3.14      23   0.252       1.69
   4   0.881      11.57      14   0.550       2.89      24   0.219       1.63
   5   0.848       6.41      15   0.517       2.72      25   0.185       1.34
   6   0.815       4.91      16   0.483       2.67      26   0.152       1.17
   7   0.781       4.88      17   0.450       2.52      27   0.119       1.10
   8   0.748       4.71      18   0.417       2.52      28   0.086       1.02
   9   0.715       4.28      19   0.384       2.48      29   0.053       0.78
  10   0.682       4.00      20   0.351       2.20      30   0.020       0.39
The CDF for a log-normal distribution can be expressed as

\[
F(c) = \Phi\left(\frac{\ln c - \mu_y}{\sigma_y}\right)
\quad\text{or}\quad
F(c) = \frac{1}{2} + \frac{1}{2}\,\operatorname{erf}\left(\frac{\ln c - \mu_y}{\sigma_y\sqrt{2}}\right)
\]

where Φ(·) represents the CDF of the standard normal deviate. Evaluation of the log-normal CDF can be easily done using built-in statistical functions found in most spreadsheet and statistical-analysis programs. The sample values of F(c) are compared with the theoretical values of F(c) (with μ_y = 1.20 and σ_y = 0.80) in Figure 10.7. Based on this comparison, it is apparent that the sample values of F(c) are consistently greater than the theoretical values of F(c) for c < 7 mg/l. Therefore, based on this visual comparison, there does not seem to be a close fit between the sample distribution and the proposed log-normal population distribution. However, a quantitative comparison using a statistical measure should be performed prior to drawing any statistically significant conclusion.

10.5.2 Comparisons of Probability Distributions
In addition to the visual comparison of the sample probability distribution with various theoretical probability distributions, quantitative comparisons are also made using hypothesis tests. The two most common hypothesis tests for assessing whether an observed probability distribution can be approximated by a given (theoretical) population distribution are the chi-square test and the Kolmogorov-Smirnov (KS) test.
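Before applying a formal test, the visual comparison described above can be reproduced numerically. The following is a minimal Python sketch, not the source's own procedure: it evaluates both forms of the log-normal CDF (the Φ form and the erf form) at the Table 10.2 concentrations, using μ_y = 1.20 and σ_y = 0.80 as quoted in the text. The function and variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

MU_Y, SIGMA_Y = 1.20, 0.80  # log-normal parameters quoted in the text

# (sample F(c), c in mg/l) pairs transcribed from Table 10.2, ranks 1 to 30
TABLE_10_2 = [
    (0.980, 15.49), (0.947, 14.30), (0.914, 14.30), (0.881, 11.57),
    (0.848, 6.41), (0.815, 4.91), (0.781, 4.88), (0.748, 4.71),
    (0.715, 4.28), (0.682, 4.00), (0.649, 3.29), (0.616, 3.26),
    (0.583, 3.14), (0.550, 2.89), (0.517, 2.72), (0.483, 2.67),
    (0.450, 2.52), (0.417, 2.52), (0.384, 2.48), (0.351, 2.20),
    (0.318, 1.98), (0.285, 1.83), (0.252, 1.69), (0.219, 1.63),
    (0.185, 1.34), (0.152, 1.17), (0.119, 1.10), (0.086, 1.02),
    (0.053, 0.78), (0.020, 0.39),
]


def lognormal_cdf_phi(c, mu_y=MU_Y, sigma_y=SIGMA_Y):
    """F(c) = Phi((ln c - mu_y) / sigma_y), using the standard normal CDF."""
    return NormalDist().cdf((math.log(c) - mu_y) / sigma_y)


def lognormal_cdf_erf(c, mu_y=MU_Y, sigma_y=SIGMA_Y):
    """F(c) = 1/2 + 1/2 erf((ln c - mu_y) / (sigma_y * sqrt(2)))."""
    return 0.5 + 0.5 * math.erf((math.log(c) - mu_y) / (sigma_y * math.sqrt(2.0)))


def compare(table=TABLE_10_2):
    """Return rows of (c, sample F, theoretical F, sample minus theoretical)."""
    rows = []
    for f_sample, c in table:
        f_theory = lognormal_cdf_erf(c)
        rows.append((c, f_sample, f_theory, f_sample - f_theory))
    return rows


if __name__ == "__main__":
    print("c (mg/l)  sample  theory  difference")
    for c, f_sample, f_theory, diff in compare():
        print(f"{c:8.2f}  {f_sample:6.3f}  {f_theory:6.3f}  {diff:+10.3f}")
```

A positive value in the last column marks a concentration at which the sample F(c) lies above the proposed log-normal curve.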
10.5.2.1 The Chi-Square Test. Based on sampling theory, it is known that if the N outcomes are divided into M classes, with X_m being the number of outcomes in class m, and p_m being the theoretical probability of an outcome being in class m, then the random variable,

\[
\chi^2 = \sum_{m=1}^{M} \frac{(X_m - N p_m)^2}{N p_m} \tag{10.54}
\]

has a chi-square distribution. The number of degrees of freedom is M − 1 if the expected frequencies can be computed without having to estimate the population parameters from the sample statistics, while the number of degrees of freedom is M − 1 − n if the expected frequencies are computed by estimating n population parameters from sample statistics. In applying the chi-square goodness of fit test, the null hypothesis is that the samples are drawn from the proposed population probability distribution. The null hypothesis is accepted at the α significance level if χ² ∈ [0, χ²_α] and rejected otherwise.
The effectiveness of the chi-square test is diminished if both the number of data intervals (called cells) and the expected number of outcomes in any cell are less than 5 (Haldar and Mahadevan, 2000; McCuen, 2002a).
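The chi-square statistic of Equation 10.54 can be illustrated with the Table 10.2 concentrations and the proposed log-normal distribution (μ_y = 1.20, σ_y = 0.80). The Python sketch below rests on assumptions that are not in the source: it uses M = 6 cells of equal theoretical probability (so p_m = 1/M and the expected count per cell is N/M = 5), it takes the degrees of freedom as M − 1 = 5 (that is, it treats the parameters as not estimated from the sample), and it compares against the tabulated critical value of 11.070 for 5 degrees of freedom at α = 0.05. All function and constant names are illustrative.

```python
import math

MU_Y, SIGMA_Y = 1.20, 0.80    # proposed log-normal parameters from the text
M_CELLS = 6                   # ASSUMPTION: number of equal-probability cells
CHI2_CRIT_5DOF_05 = 11.070    # tabulated chi-square value, 5 dof, alpha = 0.05

# Concentrations (mg/l) transcribed from Table 10.2
CONCENTRATIONS = [
    15.49, 14.30, 14.30, 11.57, 6.41, 4.91, 4.88, 4.71, 4.28, 4.00,
    3.29, 3.26, 3.14, 2.89, 2.72, 2.67, 2.52, 2.52, 2.48, 2.20,
    1.98, 1.83, 1.69, 1.63, 1.34, 1.17, 1.10, 1.02, 0.78, 0.39,
]


def lognormal_cdf(c, mu_y=MU_Y, sigma_y=SIGMA_Y):
    """Log-normal CDF in its erf form."""
    return 0.5 + 0.5 * math.erf((math.log(c) - mu_y) / (sigma_y * math.sqrt(2.0)))


def cell_counts(data, m_cells=M_CELLS):
    """Count outcomes X_m in m_cells cells of equal theoretical probability.

    Cell m covers theoretical CDF values in [m/M, (m+1)/M), so each cell
    has p_m = 1/M under the proposed distribution.
    """
    counts = [0] * m_cells
    for c in data:
        index = min(int(lognormal_cdf(c) * m_cells), m_cells - 1)
        counts[index] += 1
    return counts


def chi_square_statistic(counts):
    """Equation 10.54 with p_m = 1/M for every cell."""
    n = sum(counts)
    expected = n / len(counts)  # N * p_m, identical for all cells
    return sum((x - expected) ** 2 / expected for x in counts)


if __name__ == "__main__":
    counts = cell_counts(CONCENTRATIONS)
    chi2 = chi_square_statistic(counts)
    print("observed counts per cell:", counts)
    print("expected count per cell:", len(CONCENTRATIONS) / M_CELLS)
    print(f"chi-square statistic: {chi2:.3f}")
    # Null hypothesis is accepted if the statistic lies in [0, critical value]
    print("accept null hypothesis:", 0.0 <= chi2 <= CHI2_CRIT_5DOF_05)
```

With N = 30 and M = 6, the expected count in each cell is exactly 5, so neither the cell count nor the expected frequency falls below the threshold of 5 cited above. A different choice of cells, or counting the two parameters as estimated from the sample (M − 1 − n = 3 degrees of freedom), would change the critical value.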
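Figure 10.7. Comparison of sample and log-normal distribution. [Plot of F(c), from 0.0 to 1.0, against c (mg/L), from 0 to 20, showing the sample distribution and the lognormal distribution (μ_y = 1.2, σ_y = 0.8).]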
 