Environmental Engineering Reference
Solution
From the given data, N = 10, and the following slope estimates are determined by differencing the data in accordance with Equation (10.118):

  k\j      2      3      4      5      6      7      8      9     10
    1  −3.29  −1.10  −0.42   0.26  −0.39   0.33   0.31   0.28   0.38
    2          1.10   1.01   1.44   0.33   1.05   0.91   0.79   0.84
    3                 0.92   1.61   0.07   1.04   0.88   0.74   0.80
    4                        2.30  −0.35   1.08   0.87   0.70   0.78
    5                              −3.00   0.48   0.39   0.31   0.48
    6                                      3.95   2.08   1.41   1.35
    7                                             0.22   0.14   0.49
    8                                                    0.05   0.62
    9                                                           1.19

These results for $Q_i$ yield M = 45 estimates of the slope with a median value of $Q_{med}$ = 0.70 (mg/l)/yr. To determine the 95% confidence interval, take α = 0.05, which corresponds to a standard normal deviate $z_{\alpha/2}$ = 1.960. The following parameters can be calculated in accordance with the Sen method:

\[
\sigma_S = \sqrt{\frac{1}{18} N(N-1)(2N+5)} = \sqrt{\frac{1}{18}(10)(10-1)[2(10)+5]} = 11.18
\]
\[
C_\alpha = z_{\alpha/2}\,\sigma_S = (1.960)(11.18) = 21.91
\]
\[
M_1 = \frac{1}{2}(M - C_\alpha) = \frac{1}{2}(45 - 21.91) = 11.54
\]
\[
M_2 = \frac{1}{2}(M + C_\alpha) = \frac{1}{2}(45 + 21.91) = 33.46
\]

The 11th and 12th ranked values of $Q_i$ are 0.26 and 0.28, respectively, so the value of $Q_i$ with a rank of 11.54 is interpolated as 0.27. Similarly, the 33rd and 34th ranked values of $Q_i$ are 1.01 and 1.04, respectively, so the value of $Q_i$ with a rank of 33.46 is interpolated as 1.03. Therefore, the 95% confidence interval of the estimated slope (= 0.70) is [0.27, 1.03]. This confidence interval further supports the assertion that the slope is significantly nonzero.

10.10 RELATIONSHIPS BETWEEN VARIABLES
It is sometimes necessary to assess and evaluate the relationships between variables. This is commonly done using correlation and regression analyses.

10.10.1 Correlation
The correlation coefficient, $\rho_{xy}$, between two variables, x and y, is defined as

\[
\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y} \tag{10.120}
\]

where $\sigma_{xy}$ is the covariance between x and y, and $\sigma_x$ and $\sigma_y$ are the standard deviations of x and y, respectively. The sample estimate of $\rho_{xy}$ is commonly denoted by $r_{xy}$, which is calculated from the sample data using the relation

\[
r_{xy} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\left[\sum_{i=1}^{N} (x_i - \bar{x})^2\right]^{1/2} \left[\sum_{i=1}^{N} (y_i - \bar{y})^2\right]^{1/2}} \tag{10.121}
\]

in which $\bar{x}$ and $\bar{y}$ are the sample means of x and y. The correlation coefficient, $r_{xy}$, is usually denoted simply by r and is sometimes referred to as the Pearson product moment correlation coefficient. Values of $r_{xy}$ lie in the range [−1, 1]. When the population correlation coefficient, $\rho_{xy}$, is zero, it can be shown that the statistic t* defined as

\[
t^* = r_{xy} \sqrt{\frac{N-2}{1 - r_{xy}^2}} \tag{10.122}
\]

has a t distribution with N − 2 degrees of freedom, provided that both x and y are normally distributed. Limiting values of $r_{xy}$ corresponding to various N values for α = 0.05 are given in Table 10.9, where it is apparent that values of $r_{xy}$ much higher than zero are usually necessary to show significant correlation. For large values of N, the t distribution closely approximates the normal distribution and limiting values of $r_{xy}$ at α = 0.05 can be approximated by

\[
r_{xy} = \pm\frac{1.96}{\sqrt{N}} \tag{10.123}
\]

TABLE 10.9. Limiting Values of $r_{xy}$ for Zero Correlation at α = 0.05

    N      r_xy
    5      ±0.75
   10      ±0.58
   20      ±0.42
   30      ±0.35
   50      ±0.27
  100      ±0.20
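The correlation relations in Equations (10.121) to (10.123) can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `pearson_r`, `t_statistic`, and `large_n_limit` and the five-point sample are hypothetical and do not come from the text.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient r_xy, following Equation (10.121)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y need equal length and at least 3 points")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """Statistic t* of Equation (10.122); N - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        raise ValueError("t* is undefined for |r| = 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def large_n_limit(n, z=1.96):
    """Large-N limiting |r_xy| at alpha = 0.05, Equation (10.123)."""
    return z / math.sqrt(n)


# Hypothetical sample used only to exercise the functions.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
t_star = t_statistic(r, len(x))
print(f"r = {r:.4f}, t* = {t_star:.4f}")
print(f"large-N limit for N = 100: {large_n_limit(100):.3f}")
```

For a real data set, the computed t* would be compared with the critical t value for N − 2 degrees of freedom, or r compared directly with the limiting values in Table 10.9.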
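Returning to the Sen slope calculation in the Solution, the arithmetic can be checked with a short Python sketch. It takes the 45 slope estimates tabulated there as input, since the underlying data are not reproduced here. The helper `value_at_rank` and the variable names are hypothetical, and linear interpolation between ranks is assumed because that is what the worked solution describes. With the two-decimal values as tabulated, the upper limit interpolates to about 1.02 rather than the quoted 1.03; the difference may stem from rounding in the tabulated slopes, which is an assumption here.

```python
import math
import statistics

# Slope estimates Q (mg/l per yr), copied row by row from the Solution table.
slopes = [
    -3.29, -1.10, -0.42, 0.26, -0.39, 0.33, 0.31, 0.28, 0.38,
    1.10, 1.01, 1.44, 0.33, 1.05, 0.91, 0.79, 0.84,
    0.92, 1.61, 0.07, 1.04, 0.88, 0.74, 0.80,
    2.30, -0.35, 1.08, 0.87, 0.70, 0.78,
    -3.00, 0.48, 0.39, 0.31, 0.48,
    3.95, 2.08, 1.41, 1.35,
    0.22, 0.14, 0.49,
    0.05, 0.62,
    1.19,
]


def value_at_rank(sorted_values, rank):
    """Linearly interpolate a value at a fractional 1-based rank."""
    lower = math.floor(rank)
    frac = rank - lower
    if frac == 0:
        return sorted_values[lower - 1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + frac * (above - below)


n = 10                      # number of observations, N
z = 1.960                   # standard normal deviate for alpha = 0.05
m = len(slopes)             # number of slope estimates, M = 45
q_sorted = sorted(slopes)

q_med = statistics.median(q_sorted)
sigma_s = math.sqrt(n * (n - 1) * (2 * n + 5) / 18)
c_alpha = z * sigma_s
m1 = (m - c_alpha) / 2      # lower rank, M1
m2 = (m + c_alpha) / 2      # upper rank, M2
q_low = value_at_rank(q_sorted, m1)
q_high = value_at_rank(q_sorted, m2)

print(f"M = {m}, median slope = {q_med:.2f}")
print(f"sigma_S = {sigma_s:.2f}, C_alpha = {c_alpha:.2f}")
print(f"M1 = {m1:.2f}, M2 = {m2:.2f}")
print(f"interval = [{q_low:.2f}, {q_high:.2f}]")
```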