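$$
C \approx \frac{1}{n-1} \sum_{k=1}^{n}
\begin{bmatrix}
\dfrac{X_1^{(k)} - m_1}{s_1} \\
\vdots \\
\dfrac{X_d^{(k)} - m_d}{s_d}
\end{bmatrix}
\times
\begin{bmatrix}
\dfrac{X_1^{(k)} - m_1}{s_1} \\
\vdots \\
\dfrac{X_d^{(k)} - m_d}{s_d}
\end{bmatrix}^{T}
\qquad (1.57)
$$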
where $m_i$ and $s_i$ are the sample mean and sample standard deviation of $X_i$. Note that
the full multivariate dataset ($X_1$, $X_2$, …, $X_d$) is required for this method. This method
guarantees that the resulting C is at least positive semidefinite. The issue of positive
definiteness will be discussed later.
b. Entry-by-entry bivariate manner based on a bivariate dataset ($X_i$, $X_j$):
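$$
\delta_{ij} \approx
\frac{\dfrac{1}{n_{ij}-1} \displaystyle\sum_{k=1}^{n_{ij}} \left(X_i^{(k)} - m_i\right)\left(X_j^{(k)} - m_j\right)}
{\sqrt{\dfrac{1}{n_{ij}-1} \displaystyle\sum_{k=1}^{n_{ij}} \left(X_i^{(k)} - m_i\right)^2}
\times
\sqrt{\dfrac{1}{n_{ij}-1} \displaystyle\sum_{k=1}^{n_{ij}} \left(X_j^{(k)} - m_j\right)^2}}
\qquad (1.58)
$$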
where $n_{ij}$ is the number of bivariate ($X_i$, $X_j$) data points. The benefit of this method
is that the full multivariate dataset ($X_1$, $X_2$, …, $X_d$) is not required. Only the
bivariate datasets ($X_i$, $X_j$) for all possible pairs are needed.
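The two estimators in Equations 1.57 and 1.58 can be sketched as follows. This is a minimal illustration in plain Python rather than MATLAB. The function names `corr_matrix_full` and `corr_pair` and the small dataset `toy` are hypothetical and do not come from the text.

```python
import math

def corr_matrix_full(data):
    """Method (a), Eq. 1.57: data is a list of n rows, each holding d values."""
    n, d = len(data), len(data[0])
    # Sample mean and sample standard deviation (n - 1 divisor) of each column.
    m = [sum(row[i] for row in data) / n for i in range(d)]
    s = [math.sqrt(sum((row[i] - m[i]) ** 2 for row in data) / (n - 1))
         for i in range(d)]
    C = [[0.0] * d for _ in range(d)]
    for row in data:
        y = [(row[i] - m[i]) / s[i] for i in range(d)]  # standardized data point
        for i in range(d):
            for j in range(d):
                C[i][j] += y[i] * y[j] / (n - 1)  # accumulate the outer product
    return C

def corr_pair(xi, xj):
    """Method (b), Eq. 1.58: xi and xj are paired lists of length n_ij."""
    n_ij = len(xi)
    mi, mj = sum(xi) / n_ij, sum(xj) / n_ij
    cov = sum((a - mi) * (b - mj) for a, b in zip(xi, xj)) / (n_ij - 1)
    var_i = sum((a - mi) ** 2 for a in xi) / (n_ij - 1)
    var_j = sum((b - mj) ** 2 for b in xj) / (n_ij - 1)
    return cov / (math.sqrt(var_i) * math.sqrt(var_j))

# Hypothetical data (n = 4, d = 3), used only to exercise the functions.
toy = [[1.0, 2.0, 0.5],
       [2.0, 1.5, 1.0],
       [3.0, 3.5, 0.0],
       [4.0, 3.0, 2.0]]

C_full = corr_matrix_full(toy)
delta_12 = corr_pair([r[0] for r in toy], [r[1] for r in toy])
```

When every pair is taken from the same full dataset, as in this toy case, the two methods give the same entries. They differ only when each pair comes from its own bivariate dataset with its own $n_{ij}$.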
To illustrate the first method, we simulate a full multivariate standard normal dataset ($X_1$,
$X_2$, …, $X_d$) with d = 3 and sample size n = 10 using the procedure described in Section 1.4.4.
The random seed is initialized by the MATLAB function randn('state', 13). Three columns of
independent standard normal data are simulated using Z = normrnd(0, 1, n, 3). The
upper-triangular Cholesky factor is computed as u = chol(C), where C is the correlation matrix of
($X_1$, $X_2$, $X_3$). In the current case,
$$
C = \begin{bmatrix}
1 & 0.57 & 0.59 \\
0.57 & 1 & 0.05 \\
0.59 & 0.05 & 1
\end{bmatrix}
\qquad (1.59)
$$
Then, three columns of X data are obtained from $X^T = Z^T \times u$. The simulated data points
are shown in Table 1.9. The data in Table 1.9 have full multivariate information because ($X_1$,
$X_2$, $X_3$) are simultaneously known for each case. The first method implements Equation
1.57 to estimate C. The resulting C estimate is
$$
C \approx \begin{bmatrix}
1 & 0.713 & 0.824 \\
0.713 & 1 & 0.445 \\
0.824 & 0.445 & 1
\end{bmatrix}
\qquad (1.60)
$$
The MATLAB function C = corr(X), where X is the 10 × 3 matrix shown in Table 1.9, gives
the same result as above.
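The simulation procedure described above (draw independent standard normal data, multiply by the upper-triangular Cholesky factor of the matrix in Equation 1.59, then estimate the correlation matrix) can be sketched in plain Python as shown here. The helper names `chol_upper`, `simulate`, and `sample_corr` are hypothetical. Python's generator seeded with 13 is not equivalent to MATLAB's randn('state', 13), so this sketch is not expected to reproduce Table 1.9 or Equation 1.60.

```python
import math
import random

# Correlation matrix from Equation 1.59.
C = [[1.0, 0.57, 0.59],
     [0.57, 1.0, 0.05],
     [0.59, 0.05, 1.0]]

def chol_upper(A):
    """Upper-triangular U with U^T U = A (the convention used by MATLAB's chol)."""
    d = len(A)
    U = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i, d):
            acc = A[i][j] - sum(U[k][i] * U[k][j] for k in range(i))
            U[i][j] = math.sqrt(acc) if i == j else acc / U[i][i]
    return U

def simulate(n, U, rng):
    """Return n rows, each computed as X^T = Z^T U with Z independent standard normal."""
    d = len(U)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        rows.append([sum(z[k] * U[k][j] for k in range(d)) for j in range(d)])
    return rows

def sample_corr(data):
    """Sample correlation matrix with n - 1 divisors, as in Equation 1.57."""
    n, d = len(data), len(data[0])
    m = [sum(r[i] for r in data) / n for i in range(d)]
    s = [math.sqrt(sum((r[i] - m[i]) ** 2 for r in data) / (n - 1))
         for i in range(d)]
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in data)
             / ((n - 1) * s[i] * s[j]) for j in range(d)] for i in range(d)]

rng = random.Random(13)   # arbitrary seed; NOT the MATLAB random stream
U = chol_upper(C)
X = simulate(10, U, rng)  # 10 x 3 dataset, analogous in shape to Table 1.9
C_hat = sample_corr(X)    # estimate of C from only n = 10 points
```

With n = 10 the estimate `C_hat` can differ noticeably from the target matrix, in the same way that Equation 1.60 differs from Equation 1.59. Increasing n should bring the estimate closer to C.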
To illustrate the second method, we simulate three bivariate standard normal datasets for
the pairs ($X_1$, $X_2$), ($X_1$, $X_3$), and ($X_2$, $X_3$) with $n_{12}$ = 10, $n_{13}$ = 11, and $n_{23}$ = 9. Each pair is
simulated independently of the other two pairs. Again, the random seed is initialized by the
MATLAB function randn('state', 13) before the simulation is carried out. We obtain two
columns of independent standard normal data using Z = normrnd(0, 1, $n_{12}$, 2). The correlation