Constructing multivariate distributions for soil parameters - Risk and Reliability in Geotechnical Engineering


the following sections, but it suffices to appreciate that the above statistical tools are needed

because the sample size is finite in practice. In geotechnical engineering, our sample sizes

are typically small and statistical uncertainty cannot be ignored. Simulation is an important

tool to study statistical uncertainty.
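The role of simulation in studying statistical uncertainty can be sketched with a few lines of code. The example below is only an illustration, not the authors' procedure: it assumes a normal random variable with a mean of 100 kPa and a standard deviation of 20 kPa, and the function name sample_means, the seed, and the sample sizes are hypothetical choices. It draws many samples of a given size and measures how much the sample mean scatters from one sample to the next.

```python
import random
import statistics

def sample_means(n, n_repeats, mu=100.0, sigma=20.0, seed=13):
    """Draw n_repeats samples of size n from N(mu, sigma^2); return their means."""
    rng = random.Random(seed)  # local generator so the run is repeatable
    means = []
    for _ in range(n_repeats):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

# Assumed sample sizes: small, moderate, and large.
for n in (5, 30, 200):
    means = sample_means(n, n_repeats=2000)
    scatter = statistics.stdev(means)   # observed scatter of the sample mean
    theory = 20.0 / n ** 0.5            # sigma / sqrt(n) for independent data
    print(f"n = {n:3d}: scatter of sample mean = {scatter:5.2f} kPa "
          f"(theory {theory:5.2f} kPa)")
```

The scatter shrinks roughly in proportion to 1/√n, which is why statistical uncertainty matters most for the small sample sizes that are typical in geotechnical practice.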

Second, it is important to appreciate that a list of “random” looking measurements does

not necessarily follow a random variable model. From the authors' experience in statistical

modeling of geotechnical engineering data, we have found this model adequate in the sense

of producing meaningful and useful results for practice. This chapter presupposes that a

random variable model is adequate for geotechnical engineering data. If one accepts this

leap of faith, the obvious follow-up question is which CDF would be appropriate. In view

of the finite sample size, this “goodness-of-fit” question cannot be resolved with certainty.

Some standard “goodness-of-fit” tests are presented below, but one should be mindful

that it is not sufficient to find a good fit for a list of measurements (say a column of numbers

in Excel). Geotechnical engineering data are multivariate in nature; for example, one

may measure several properties such as the undrained shear strength, natural water content,

Atterberg limits, and preconsolidation pressure from the same undisturbed sample. While

there is a wide choice of probability models to fit a single column of numbers (univariate

data), there is only one practical choice to fit multiple columns (multivariate data). This

choice involves a column-by-column nonlinear transformation of a multivariate normal

probability model. Because of this restriction, it is more convenient to choose a

univariate probability model that is a transformation of the standard normal model. The Johnson

system of distributions is generated by such a transformation and it is useful to start testing

goodness of fit using this system of distributions.
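The idea of a column-by-column transformation of a multivariate normal model can be sketched as follows. This is a minimal illustration under stated assumptions, not the construction developed later in the chapter: it uses the commonly cited unbounded (SU) and bounded (SB) Johnson forms, the parameter names gamma, delta, xi, and lam follow one common convention and may differ from the chapter's notation, and every numerical value, including the correlation of -0.6, is hypothetical.

```python
import math
import random

def johnson_su(z, gamma, delta, xi, lam):
    """Unbounded Johnson (SU) value obtained from a standard normal value z."""
    return xi + lam * math.sinh((z - gamma) / delta)

def johnson_sb(z, gamma, delta, xi, lam):
    """Bounded Johnson (SB) value, confined to the interval (xi, xi + lam)."""
    return xi + lam / (1.0 + math.exp(-(z - gamma) / delta))

def correlated_standard_normals(rho, n, rng):
    """Return n pairs of standard normal values with correlation rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        pairs.append((z1, z2))
    return pairs

rng = random.Random(13)  # arbitrary seed for repeatability
pairs = correlated_standard_normals(rho=-0.6, n=1000, rng=rng)

# Column-by-column transformation; all parameter values are hypothetical.
data = [
    (johnson_su(z1, gamma=-1.0, delta=2.0, xi=60.0, lam=40.0),  # column 1: unbounded
     johnson_sb(z2, gamma=0.0, delta=1.5, xi=10.0, lam=80.0))   # column 2: within (10, 90)
    for z1, z2 in pairs
]
print(data[:3])
```

Each column is non-normal after the transformation, while the dependence between columns is inherited from the underlying correlated normal pairs.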

The above “need to know” concepts are explained and illustrated using simulated data

in the sections below. Simulated data are “perfect” in the sense that they are theoretically

derived from a fully defined random variable. Hence, in contrast to actual data, there is no

question that a random variable model works! In addition, it is useful to compare statistics

computed from a finite sample size with the theoretical answers, which are also known since

the random variable is fully defined.

1.2.2 Normal random variable

The normal distribution is also called the Gaussian distribution. Symbolically, “Y ~ N(μ, σ²)”

means that Y is normally distributed with mean, μ, and standard deviation, σ. The normal

distribution is the most important distribution in characterizing a physical parameter that

can take a range of values with different likelihoods of occurrence. Its importance will be

apparent in the context of non-normal multivariate distributions discussed in Section 1.6.

The concepts discussed below are illustrated using normally distributed undrained shear

strength values with a mean of 100 kPa and a standard deviation of 20 kPa, unless stated

otherwise. These values were simulated using the MATLAB function normrnd. The reader can

reproduce the data by initializing the pseudorandom sequence using randn('state', 13).
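For readers without MATLAB, a comparable simulation can be sketched in Python. This is an assumed analogue, not the authors' script: the function name simulate_su and the sample size of 10 are hypothetical, and although the seed 13 mirrors the text, Python's generator produces a different pseudorandom sequence, so the numbers will not match those in the text.

```python
import random
import statistics

MU, SIGMA = 100.0, 20.0  # kPa, mean and standard deviation stated in the text

def simulate_su(n, seed=13):
    """Return n simulated undrained shear strengths (kPa) from N(MU, SIGMA^2)."""
    rng = random.Random(seed)  # fixed seed so the same values are regenerated
    return [rng.gauss(MU, SIGMA) for _ in range(n)]

su = simulate_su(10)  # assumed sample size
print("sample mean:", round(statistics.fmean(su), 1), "kPa (theoretical 100)")
print("sample std :", round(statistics.stdev(su), 1), "kPa (theoretical 20)")
```

Calling simulate_su again with the same seed regenerates the same values, which plays the same role as initializing the pseudorandom sequence in MATLAB.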

1.2.2.1 Probability density function

The PDF for the normal distribution is

\[ f(y) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(y - \mu)^2}{2\sigma^2} \right] \qquad (1.1) \]
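As a quick numerical check of Equation (1.1), the normal PDF can be evaluated directly. The sketch below assumes the running example of a mean of 100 kPa and a standard deviation of 20 kPa. The function name normal_pdf is hypothetical, and the result is compared against NormalDist from the Python standard library.

```python
import math
from statistics import NormalDist

def normal_pdf(y, mu, sigma):
    """Normal PDF f(y) for mean mu and standard deviation sigma (sigma > 0)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

mu, sigma = 100.0, 20.0  # kPa, values of the running example
for y in (60.0, 80.0, 100.0, 120.0):
    # The hand-coded value and the library value should agree.
    print(y, normal_pdf(y, mu, sigma), NormalDist(mu, sigma).pdf(y))
```

The density peaks at y = μ, where it equals 1/(√(2π)·σ), and it is symmetric about the mean, so the values at 80 kPa and 120 kPa coincide.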
