Constructing multivariate distributions for soil parameters - Risk and Reliability in Geotechnical Engineering

normal vector as a generalization of the bivariate case coupled by a correlation matrix,

(4) single non-normal random variable as a nonlinear transform of the normal random

variable, and (5) multivariate non-normal vector as a component-by-component nonlinear

transform of the multivariate normal case. No prior knowledge of probability and statistics

is required, but the reader may need to read standard texts for details. The emphasis in this

chapter is on how to use the theoretical tools to produce useful results in practice. In other

words, given a table of measured numbers (multivariate data), how would an engineer (1)

identify a reasonable probability model (“goodness-of-fit” problem) from data, (2) estimate

the model parameters (e.g., mean, COV) from data, (3) simulate “virtual site” data from

the probability model, and (4) draw useful engineering conclusions from the probability

model? The sample size of geotechnical data is typically small. Statistical uncertainties are

ubiquitous and play a significant role in practice. Complete multivariate data are also rarely

available. These aspects and other important limitations are comprehensively discussed to

ensure that the engineer is fully informed of the practical limits of statistical inference in

geotechnical engineering.

1.2 NORMAL RANDOM VARIABLE

1.2.1 Random data

Random data can be viewed as a list of numbers taking a range of values and assuming a

different frequency of occurrences when plotted in the form of a histogram. Random data

can be modeled as a random variable following a cumulative distribution function (CDF).

For continuous variables, the CDF can be presented in the form of its derivative. This

derivative is called the probability density function (PDF).
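
As a minimal numerical sketch of this relationship, the Python snippet below approximates the derivative of a CDF with a central finite difference and compares it with the PDF. Python is used here only for illustration, and the normal model with a mean of 100 and a standard deviation of 20, the helper name cdf_slope, and the step size h are all assumptions made for this example.

```python
from statistics import NormalDist

# Assumed illustrative model: normal with mean 100 and standard deviation 20.
model = NormalDist(mu=100.0, sigma=20.0)


def cdf_slope(x, h=1e-4):
    """Central finite-difference approximation of d(CDF)/dx at x."""
    return (model.cdf(x + h) - model.cdf(x - h)) / (2.0 * h)


# Compare the numerical slope of the CDF with the PDF at a few points.
for x in (60.0, 100.0, 130.0):
    print(f"x = {x:6.1f}  slope of CDF = {cdf_slope(x):.6f}  PDF = {model.pdf(x):.6f}")
```

The two printed columns should agree to within the accuracy of the finite difference, which is what "the PDF is the derivative of the CDF" means in practice.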

It is crucial to distinguish between a random variable and a list of measured values, say a

list of undrained shear strength (s_u) values obtained by performing unconfined compression

tests on undisturbed samples. The former is a mathematical model. The latter is reality:

what you measure in practice. There are two challenges in linking what you measure to a

random variable.
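
The distinction can be sketched in a few lines of Python, used here only for illustration. The snippet assumes a normal model for undrained shear strength with a mean of 100 kPa and a standard deviation of 20 kPa, and it stands in for real measurements with two simulated lists of 30 values each. The seeds are arbitrary and only make the run repeatable.

```python
from statistics import NormalDist, fmean

# The random variable: a mathematical model (assumed parameters, in kPa).
model = NormalDist(mu=100.0, sigma=20.0)

# Two hypothetical "measured" lists of 30 values each, simulated from the model.
sample_a = model.samples(30, seed=1)
sample_b = model.samples(30, seed=2)

# The model has one fixed mean; each finite list has its own arithmetic average.
print("model mean       :", model.mean)
print("average of list A:", round(fmean(sample_a), 2))
print("average of list B:", round(fmean(sample_b), 2))
```

The model mean is fixed by definition, whereas the average of each list depends on which 30 values happened to be drawn.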

First, the number of data points in a list of measurements (called sample size) must be finite.

It is relatively easy to simulate a finite list of values if the random variable is defined. For

example, if the undrained shear strength is normally distributed with a mean of 100 kPa and

a standard deviation of 20 kPa, we can obtain, say, 30 values using the MATLAB® function

normrnd(100, 20, 30, 1). The same simulation can be performed in Excel using Data > Data Analysis >

Random Number Generation. It is important to note that the theoretical properties

such as the mean of a random variable can be obtained only from an infinite sample (called

a population). The arithmetic average obtained from a finite sample is called the “sample

mean.” In this chapter, the term “mean” is associated with a random variable while the

term “sample mean” is associated with a finite sample. The same terminology applies to

other properties as well. It is possible to simulate different finite samples. Given the random

nature of the data, the sample mean computed from one sample will be different from the

sample mean computed from another sample. This phenomenon is called “statistical uncertainty,”

and it is crucial to appreciate that all quantities estimated from a finite sample are

subject to this fundamental limitation. The upshot is that no theoretical property can

be estimated with perfect precision. A point estimate will be implicitly associated with a

statistical error. It is arguably more accurate to report an estimate of a theoretical property

in the form of a confidence interval. An alternative is to report the p-value associated

with a null hypothesis for a theoretical property. These concepts will be made specific in
