Constructing multivariate distributions for soil parameters - Risk and Reliability in Geotechnical Engineering

$$m + t_{0.025} \cdot \frac{s}{\sqrt{n}} \le \mu \le m + t_{0.975} \cdot \frac{s}{\sqrt{n}} \qquad (1.21)$$

where $t_{0.025}$ and $t_{0.975}$ are, respectively, the 0.025 and 0.975 percentiles of the Student's t-distribution with n − 1 DOF.
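As a rough numerical check of Equation 1.21, the sketch below plugs in the sample values quoted later in this section (m = 101.84, s = 23.82, n = 10). It is written in Python rather than the MATLAB referred to in the text. The percentile t_{0.975} ≈ 2.262 for 9 DOF is hard-coded from standard tables as an assumption, and t_{0.025} = −t_{0.975} by symmetry. The function and constant names are illustrative only.

```python
import math

# Assumption: 0.975 percentile of Student's t-distribution with 9 DOF,
# taken from standard tables rather than computed here.
T_0975_DOF9 = 2.262157


def mean_confidence_interval(m, s, n, t_upper):
    """95% interval for mu per Equation 1.21, assuming normally distributed data."""
    half_width = t_upper * s / math.sqrt(n)
    # t_0.025 = -t_0.975 because the t-distribution is symmetric about zero.
    return m - half_width, m + half_width


mu_lo, mu_hi = mean_confidence_interval(101.84, 23.82, 10, T_0975_DOF9)
print(f"95% CI for mu: [{mu_lo:.2f}, {mu_hi:.2f}]")
```

Under these inputs the bounds agree with the analytical interval [84.80, 118.88] quoted in the bootstrapping discussion.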

For the sample standard deviation, s, if Y is indeed normally distributed (which, again, may not be true), the standardized quantity (n − 1)s²/σ² follows the χ-squared distribution with (n − 1) DOF. An empirical example of this χ-squared distribution with 9 DOF is shown in Figure 1.8b. One can then establish the 95% confidence interval of σ² by

$$\frac{(n-1) \cdot s^2}{\chi^2_{0.975}} \le \sigma^2 \le \frac{(n-1) \cdot s^2}{\chi^2_{0.025}} \qquad (1.22)$$

where $\chi^2_{0.025}$ and $\chi^2_{0.975}$ are, respectively, the 0.025 and 0.975 percentiles of the χ-squared distribution with (n − 1) DOF.
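Equation 1.22 can be checked the same way. The following Python sketch uses the sample values quoted later in this section (s = 23.82, n = 10) and hard-codes the χ-squared percentiles for 9 DOF from standard tables, which is an assumption of this example. Taking square roots of the bounds on σ² gives an interval for σ. The names are illustrative only.

```python
import math

# Assumption: percentiles of the chi-squared distribution with 9 DOF,
# taken from standard tables rather than computed here.
CHI2_0025_DOF9 = 2.700389
CHI2_0975_DOF9 = 19.022768


def variance_confidence_interval(s, n, chi2_lower, chi2_upper):
    """95% interval for sigma^2 per Equation 1.22, assuming normal data."""
    scaled = (n - 1) * s ** 2
    # The larger percentile goes in the denominator of the lower bound.
    return scaled / chi2_upper, scaled / chi2_lower


var_lo, var_hi = variance_confidence_interval(
    23.82, 10, CHI2_0025_DOF9, CHI2_0975_DOF9
)
sigma_lo, sigma_hi = math.sqrt(var_lo), math.sqrt(var_hi)
print(f"95% CI for sigma: [{sigma_lo:.2f}, {sigma_hi:.2f}]")
```

Under these inputs the bounds for σ agree with the analytical interval [16.38, 43.49] quoted in the bootstrapping discussion. The interval is not symmetric about s because the χ-squared distribution is skewed.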

1.2.3.5.2 Bootstrapping

Equations 1.20 through 1.22 are based on the strong assumption that the data are normally distributed. In practice, we do not know the distribution of the data. We can test for normality using the K-S test described below, but it would be convenient to obtain confidence intervals for μ and σ without making an assumption about the distribution. Nonparametric bootstrapping (Efron and Tibshirani 1993) is a general framework for obtaining approximate samples from the sampling distribution of any statistic. Let the statistic of interest be denoted by g(Y(1), Y(2), …, Y(n)). For the sample mean, m, g(Y(1), Y(2), …, Y(n)) = (Y(1) + Y(2) + … + Y(n))/n. The steps for bootstrapping are as follows:

1. Resample (Y(1), Y(2), …, Y(n)) with replacement. Denote the resampled Y by (Y′(1), Y′(2), …, Y′(n)). Note that (Y′(1), Y′(2), …, Y′(n)) may contain repeated values, because the values are drawn with replacement.

2. Evaluate g(Y′(1), Y′(2), …, Y′(n)). This is a resampled g value.

3. Repeat steps 1 and 2 to obtain B resampled g values. Note that B is distinct from n.

The B resampled g values can be viewed as approximate realizations of the sampling distribution of g.
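The three steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the authors' MATLAB code. The sample `y` is a hypothetical set of 10 draws from a normal distribution with mean 100 and standard deviation 20 produced by Python's own generator, so it will not reproduce the sample or the intervals reported in the text. The helper names `bootstrap`, `percentile`, and `percentile_interval` are introduced here for illustration.

```python
import math
import random
import statistics


def bootstrap(data, statistic, B=1000, seed=0):
    """Return B resampled values of statistic(data)."""
    rng = random.Random(seed)
    n = len(data)
    values = []
    for _ in range(B):                      # step 3: repeat B times
        resample = rng.choices(data, k=n)   # step 1: draw n values with replacement
        values.append(statistic(resample))  # step 2: evaluate g on the resample
    return values


def percentile(ordered, p):
    """Linearly interpolated percentile of a sorted list, p in [0, 1]."""
    pos = p * (len(ordered) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def percentile_interval(values, alpha=0.05):
    """Interval bounded by the alpha/2 and 1 - alpha/2 sample percentiles."""
    ordered = sorted(values)
    return percentile(ordered, alpha / 2), percentile(ordered, 1 - alpha / 2)


# Hypothetical sample: NOT the MATLAB sample used in the text.
rng = random.Random(13)
y = [rng.gauss(100, 20) for _ in range(10)]

m_values = bootstrap(y, statistics.mean, B=1000, seed=1)
s_values = bootstrap(y, statistics.stdev, B=1000, seed=2)

m_lo, m_hi = percentile_interval(m_values)
s_lo, s_hi = percentile_interval(s_values)
print(f"mean:  [{m_lo:.2f}, {m_hi:.2f}]")
print(f"stdev: [{s_lo:.2f}, {s_hi:.2f}]")
```

Passing the statistic as a function keeps the resampling loop independent of g, so the same routine serves for the mean, the standard deviation, or any other statistic of the sample.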

Again, we initialized by randn('state', 13) before executing normrnd(100, 20, 10, 1). The sample mean is m = 101.84 and the sample standard deviation is s = 23.82. These are the point estimates for μ and σ. However, it is not clear how large the statistical uncertainties are. Figure 1.10 shows the histograms of B = 1000 resampled m values and B = 1000 resampled s values based on the bootstrapping procedure. The 95% confidence intervals of μ and σ can be estimated as the intervals bounded by the 0.025 and 0.975 sample percentiles of the resampled values. Such an interval is called the 95% bootstrap confidence interval, and this method is called the percentile method (Efron 1981). For μ, the 95% bootstrap confidence interval is [88.37, 115.58], which can be compared to the 95% analytical confidence interval [84.80, 118.88] based on Equation 1.21. For σ, the 95% bootstrap confidence interval is [15.51, 27.90], which can be compared to the 95% analytical confidence interval [16.38, 43.49] based on Equation 1.22 (taking square roots of the bounds on σ²). The difference between the bootstrap and analytical confidence intervals is due to the small sample size, n = 10. The problem of “insufficient coverage” for bootstrap confidence intervals of σ was discussed in Schenker (1985): the probability that the bootstrap confidence interval covers the actual value of σ is lower than expected. This problem may occur when the sample size (n) is small. The bootstrap method is based on the assumption that the discrete samples
