The univariate cdf of the RV Z(u) is used to characterize
uncertainty about the value z(u), and the multivariate cdf
is used to characterize joint uncertainty about the N values
z(u_1), …, z(u_N).
The bivariate (N = 2) cdf of any two RVs Z(u_1), Z(u_2), or
more generally Z(u_1), Y(u_2), is particularly important since
conventional geostatistical procedures are restricted to
univariate (F(u; z)) and bivariate (F(u_1, u_2; z_1, z_2)) distributions.
In practice, statistical parameters must be inferred at unsampled
locations. The paradigm
underlying geostatistical inference is to trade the unavail-
able replication at location u for another replication avail-
able somewhere else in space and/or time. For example, the
cdf F( u ;z) may be inferred from the sampling distribution
of z-samples collected at other locations within the same
field.
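The trade of replication described above can be illustrated with a minimal sketch, assuming the samples collected at different locations have already been judged to belong to one stationary field. The function name `empirical_cdf` and the grade values are hypothetical, chosen only for illustration; the sketch uses equal weights and ignores declustering.

```python
from bisect import bisect_right


def empirical_cdf(samples):
    """Return a function z -> proportion of pooled samples that are <= z."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def cdf(z):
        # bisect_right gives the number of sorted values that are <= z
        return bisect_right(ordered, z) / n

    return cdf


# Hypothetical z-samples taken at different locations within the same field
pooled_grades = [0.4, 1.2, 0.7, 2.5, 0.9, 1.2, 0.3, 1.8]
F = empirical_cdf(pooled_grades)
print(F(1.2))  # proportion of pooled samples not exceeding 1.2
```

The resulting step function stands in for F(u; z) at any location of the field only insofar as the pooling decision is accepted.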
This trade of replication corresponds to the decision of
stationarity. Stationarity is a property of the RF model, not of
the underlying physical spatial distribution. Thus, it cannot
be checked from data. The decision to pool data into statis-
tics across geologic units is not refutable a priori from data;
however, it can be shown inappropriate a posteriori if differ-
entiation of a domain significantly improves the inferences
and estimations obtained.
The RF {Z(u), u in A} is said to be stationary within the
field A if its multivariate cdf is invariant under any
translation of the N coordinate vectors u_k.

For N = 2, the cdf concerned is the bivariate cdf:

$$F(\mathbf{u}_1, \mathbf{u}_2; z_1, z_2) = \text{Prob}\{Z(\mathbf{u}_1) \le z_1,\ Z(\mathbf{u}_2) \le z_2\}$$
One important statistic of the bivariate cdf F(u_1, u_2; z_1, z_2) is
the covariance function, defined as:

$$C(\mathbf{u}_1, \mathbf{u}_2) = E\{Z(\mathbf{u}_1)\, Z(\mathbf{u}_2)\} - E\{Z(\mathbf{u}_1)\}\, E\{Z(\mathbf{u}_2)\}$$
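As a rough numerical illustration of the covariance as the expected product minus the product of expectations, the sketch below replaces each expectation with an equal-weight average over paired values. The function name `covariance` and both data sets are hypothetical; in practice the pairs would have to be pooled under a stationarity decision.

```python
def covariance(pairs):
    """Sample version of E{Z1*Z2} - E{Z1}*E{Z2} from (z1, z2) pairs."""
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one pair")
    mean_1 = sum(z1 for z1, _ in pairs) / n
    mean_2 = sum(z2 for _, z2 in pairs) / n
    mean_product = sum(z1 * z2 for z1, z2 in pairs) / n
    # Non-centered covariance minus the product of the two means
    return mean_product - mean_1 * mean_2


# Hypothetical paired values: one directly related set, one inversely related
direct_pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
inverse_pairs = [(1.0, 6.0), (2.0, 4.0), (3.0, 2.0)]

print(covariance(direct_pairs))   # positive
print(covariance(inverse_pairs))  # negative
```

The averages divide by n rather than n - 1, which matches the expectation form of the definition rather than an unbiased sample estimator.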
The covariance is a summary statistic that is positive when
Z(u_1) and Z(u_2) are directly related and negative when they
are inversely related. The magnitude of the covariance
summarizes the strength of the relationship. It is a single
number that summarizes the bivariate distribution. When a more
complete summary is needed, the bivariate cdf F(u_1, u_2; z_1, z_2)
can be described by considering binary indicator transforms
for thresholds of the Z variable, with I(u; z) = 1 if Z(u) ≤ z
and 0 otherwise. The bivariate cdf at thresholds z_1 and z_2
then appears as the non-centered covariance of the indicator
variables I(u_1; z_1) and I(u_2; z_2).

Stationarity of the RF, that is, invariance of its multivariate
cdf under translation, is written:

$$F(\mathbf{u}_1, \ldots, \mathbf{u}_N; z_1, \ldots, z_N) = F(\mathbf{u}_1 + \mathbf{l}, \ldots, \mathbf{u}_N + \mathbf{l}; z_1, \ldots, z_N) \quad \text{for any vector } \mathbf{l}$$
Invariance of the multivariate cdf entails invariance of any
lower order cdf, including the univariate and bivariate cdfs,
and invariance of all their moments. The decision of
stationarity allows inference. For example, the unique stationary
cdf

$$F(z) = F(\mathbf{u}; z), \quad \text{for all } \mathbf{u} \in A$$

can be inferred from the cumulative sample histogram of the
z-data values available at various locations within A. The
stationary mean and variance can then be calculated from
that stationary cdf F(z), and the stationary covariance
can also be inferred.

In terms of indicator variables, the bivariate cdf is written as
the non-centered indicator covariance:

$$F(\mathbf{u}_1, \mathbf{u}_2; z_1, z_2) = E\{I(\mathbf{u}_1; z_1)\, I(\mathbf{u}_2; z_2)\}$$
Stationarity is critical for the appropriateness and reliabil-
ity of geostatistical methods. Pooling data across geological
boundaries may mask important grade differences; on the
other hand, splitting the data into too many small stationary
subsets may lead to unreliable statistics based on too few
data per subset. The rule in statistical inference is to pool the
largest amount of relevant information to formulate predic-
tive statements (Chap. 4).
Since stationarity is a property of the RF model, the de-
cision of stationarity may change if the scale of the study
changes or if more data becomes available. If the goal of the
study is global, then local details may be less important; con-
versely, the more data available approaching final decisions
such as grade control or final mine design, the more statisti-
cally significant differentiation becomes possible.
Consider a stationary random function Z with known
mean m and variance σ². The mean and variance are
independent of location.

The relation between the bivariate cdf and the non-centered
indicator covariance is important for the interpretation of the
indicator geostatistics formalism; it shows that the inference of
bivariate cdfs can be done through sample indicator covariances.
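Inference of a bivariate cdf value through a sample indicator covariance can be sketched as follows. This is a minimal illustration, assuming a set of paired values (z(u_1), z(u_2)) is available; the function names and the data are hypothetical, and the pairs are weighted equally.

```python
def indicator(z, threshold):
    """Binary indicator transform: 1 if z <= threshold, else 0."""
    return 1 if z <= threshold else 0


def bivariate_cdf_estimate(pairs, z1, z2):
    """Average of I(u1; z1) * I(u2; z2) over the available pairs."""
    n = len(pairs)
    if n == 0:
        raise ValueError("need at least one pair")
    total = sum(indicator(a, z1) * indicator(b, z2) for a, b in pairs)
    return total / n


# Hypothetical paired values (z(u1), z(u2))
pairs = [(0.5, 0.7), (1.5, 1.1), (0.9, 2.0), (0.2, 0.4)]
print(bivariate_cdf_estimate(pairs, 1.0, 1.0))
```

Repeating the calculation over a grid of thresholds z_1 and z_2 gives a discretized picture of the full bivariate cdf rather than the single number supplied by the covariance.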
The probability density (or mass) function (pdf) represen-
tation is more relevant for categorical variables. Recall that
a categorical variable Z( u ) may take one of K outcome val-
ues k = 1,…, K , arising from a naturally occurring categori-
cal variable or from a continuous variable discretized into
K classes.
Inference of any statistic requires some repetitive sam-
pling. For example, repetitive sampling of the variable z( u )
is needed to evaluate the cdf through experimental propor-
tions:
$$F(\mathbf{u}; z) = \text{Prob}\{Z(\mathbf{u}) \le z\} = \text{Proportion}\{z(\mathbf{u}) \le z\}$$
However, in almost all applications at most one sample
is available at any single location u, in which case z(u) is
known (ignoring sampling errors), and the need to consider
the RV model Z(u) vanishes. The need remains to infer the
statistical parameters at unsampled locations.