Environmental Engineering Reference
ratio E_u/s_u to obtain the rigidity index, which is another soil property. If you repeat this
procedure for different soil samples, you will obtain different pairs of values that can be
entered as two columns of numbers in EXCEL. However, it is crucial to know that each
row refers to one soil sample. Such a database is called bivariate data. Another method is
to extract s_u from one soil sample and E_u from a different soil sample. In other words, the
pair of values (s_u, E_u) is extracted from two different stress-strain curves or perhaps, more
commonly, one value is measured in the laboratory and the other value is measured in the
field away from the sampling location. This procedure will also produce two columns of
numbers in EXCEL. Such a database is called two sets of univariate data. The critical
difference is that each pair of values in bivariate data could be related in some way, because
the two values are merely different aspects of the same stress-strain curve. For two sets of
univariate data, there is no relation between s_u and E_u even if they appear in the same row,
because they refer to properties of two different soil samples. In fact, one can randomly
shuffle the s_u values (or E_u values) in one column and nothing changes, because the s_u value
and the E_u value appearing in a single row are still associated with different soil samples.
In summary, the row has no meaning for two sets of univariate data.
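The claim that shuffling changes nothing for two sets of univariate data can be checked with a minimal sketch. The numbers below are hypothetical values made up for illustration (they are not from the text), and `column_summary` is a helper name introduced here. Each column is summarized on its own, so the order of its rows does not matter.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical values, assumed to come from DIFFERENT soil samples:
# s_u in kPa (e.g., laboratory) and E_u in MPa (e.g., field).
s_u = np.array([40.0, 75.0, 120.0, 200.0, 310.0])
e_u = np.array([12.0, 8.0, 35.0, 20.0, 15.0])


def column_summary(x):
    """Summarize one column on its own: mean, sample std, min, max."""
    return (float(np.mean(x)), float(np.std(x, ddof=1)),
            float(np.min(x)), float(np.max(x)))


# Randomly reorder the s_u column; the E_u column is left untouched.
s_u_shuffled = rng.permutation(s_u)

print("s_u before shuffle:", column_summary(s_u))
print("s_u after shuffle: ", column_summary(s_u_shuffled))
print("E_u (unchanged):   ", column_summary(e_u))
```

The per-column summaries are the same before and after the shuffle, which is all the information that two sets of univariate data carry.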
In contrast, it is critical to enter the pair of values (s_u, E_u) in the correct row for bivariate
data. It is more accurate to visualize bivariate data as the coordinates of a point in a scatter
plot than as two columns of numbers. Consider the pair (s_u = 40 kPa, E_u = 8 MPa) and the pair
(s_u = 200 kPa, E_u = 20 MPa). When these pairs of values are plotted as points, say with
E_u as the vertical axis and s_u as the horizontal axis, it is clear that the points will be shifted
drastically when one column is shuffled, creating two new pairs of values: (s_u = 200 kPa,
E_u = 8 MPa) and (s_u = 40 kPa, E_u = 20 MPa). One would also be changing the physical
property of the soil in a fundamental way in terms of the rigidity index. The original rigidity
indices are 200 and 100, respectively. The new rigidity indices are 40 and 500! Clearly, minor
errors in entering values are less significant than entering values in the incorrect row for
bivariate data.
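The rigidity index arithmetic above can be reproduced with a short sketch. The two pairs are the ones quoted in the text. The function name `rigidity_indices` and the explicit MPa-to-kPa conversion are choices made here for illustration.

```python
# Pairs from the text: one row per soil sample.
s_u_kpa = [40.0, 200.0]   # undrained shear strength, kPa
e_u_mpa = [8.0, 20.0]     # undrained modulus, MPa


def rigidity_indices(s_u_kpa, e_u_mpa):
    """Return E_u / s_u for each row, converting E_u from MPa to kPa first."""
    return [1000.0 * e / s for s, e in zip(s_u_kpa, e_u_mpa)]


# Rows as measured: each (s_u, E_u) pair belongs to one soil sample.
original = rigidity_indices(s_u_kpa, e_u_mpa)

# Swap the order of the s_u column only, which mismatches the rows.
shuffled = rigidity_indices(s_u_kpa[::-1], e_u_mpa)

print("original rigidity indices:", original)
print("after shuffling s_u:      ", shuffled)
```

The individual s_u and E_u values are the same in both cases. Only the row pairing differs, and that alone changes the derived soil property.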
The scatter plot provides a graphical overview of an important concept called “dependency”
between two variables. An actual example is shown in Figure 1.11. The a and n
parameters control a nonlinear equation, called the van Genuchten model, which describes
how the volumetric water content of an unsaturated soil varies with the matric suction. It
is quite clear that a downward trend exists, that is, a small value of a is associated with a
large value of n and vice versa. If you shuffle the columns containing these values, you may
produce a pair of values where both a and n are large. Such a data point does not exist in
practice and there are good physical reasons to explain this. In fact, the tight clustering of
data around a downward trend, which is a “signature” profile of the data in Figure 1.11,
will be lost when coordinates are randomized in the shuffling process. Hence, whereas
shuffling has no effect on two columns of univariate data, it effectively destroys
two columns of bivariate data and the underlying physical basis.
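The loss of the downward trend under shuffling can be illustrated numerically. The sketch below uses synthetic numbers as a stand-in. They are not the a and n data of Figure 1.11, and the slope and noise level are arbitrary assumptions chosen only to produce a tight negative trend. The sample correlation coefficient from `np.corrcoef` is used as a simple summary of the trend.

```python
import numpy as np

rng = np.random.default_rng(42)
n_points = 200

# Synthetic stand-in for a bivariate data set with a tight downward trend.
# These are NOT the van Genuchten parameters plotted in Figure 1.11.
x = rng.normal(size=n_points)
y = -0.9 * x + 0.3 * rng.normal(size=n_points)

# Correlation with rows intact: each (x, y) pair is one "sample".
r_paired = np.corrcoef(x, y)[0, 1]

# Shuffle one column only, which breaks the row pairing.
y_shuffled = rng.permutation(y)
r_shuffled = np.corrcoef(x, y_shuffled)[0, 1]

print(f"correlation with rows intact: {r_paired:.3f}")
print(f"correlation after shuffling:  {r_shuffled:.3f}")
```

The shuffled column contains exactly the same values as before, so its own histogram is unchanged. What disappears is the relation between the two columns: the paired correlation is strongly negative, while the shuffled one is close to zero.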
It is easy to appreciate the concept of dependency in a visual way using a scatter plot.
However, it is less straightforward to characterize dependency quantitatively. Mathematically,
you would need a bivariate probability model, which can be expressed as a two-dimensional
PDF or CDF. The most common model is the bivariate normal model that is discussed in
detail below. Dependency is succinctly characterized by a single number called the Pearson
or product-moment correlation coefficient in this model. Nonetheless, it is important to
note that this model is not unique. Here, we do not mean that the marginal distribution of
each component (e.g., s_u) is non-normal. We will show below that it is always possible to
transform a non-normal component into a normal component. However, two normal
components do not necessarily constitute a bivariate normal vector. It suffices to state here
that a random vector is bivariate normal (vis-à-vis univariate normal) if and only if all linear