In statistical estimation of probability density functions in high dimensions, it is more convenient to first estimate cross sections, as in the excess mass approach of Hartigan (1987). Specifically, let $f : \mathbb{R}^d \to \mathbb{R}^+$ be the unknown density of a random vector $X$; then for each $\alpha > 0$, the $\alpha$-level set of $f$ is
$$A_\alpha(f) = \{x \in \mathbb{R}^d : f(x) \geq \alpha\}. \tag{1}$$
From the knowledge of all level sets, $f$ is recovered as
$$f(x) = \int_0^{\infty} 1_{A_\alpha}(x)\, d\alpha. \tag{2}$$
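The recovery of a density from its level sets can be checked numerically. The following is a minimal sketch, assuming a one-dimensional standard normal density as a stand-in for f; the function names, the truncation point `alpha_max`, and the grid size are illustrative choices, not part of the source.

```python
import numpy as np

def density(x):
    """Standard normal density, used here only as a stand-in for f."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def in_level_set(x, alpha):
    """Indicator of the alpha-level set {x : f(x) >= alpha}, evaluated at x."""
    return density(x) >= alpha

def recover_from_level_sets(x, alpha_max=0.5, n_alpha=200_000):
    """Approximate the integral over alpha of the level-set indicator at x.

    alpha_max only needs to exceed sup f (about 0.3989 here), because the
    indicator is zero for every alpha above f(x).
    """
    d_alpha = alpha_max / n_alpha
    alphas = (np.arange(n_alpha) + 0.5) * d_alpha  # midpoint rule on (0, alpha_max)
    return float(np.sum(in_level_set(x, alphas)) * d_alpha)

for x in (0.0, 1.0, 2.5):
    # The two printed values should agree up to the grid spacing d_alpha.
    print(x, float(density(x)), recover_from_level_sets(x))
```

For a fixed x the indicator equals 1 exactly when alpha is at most f(x), so the integral measures the length of the interval (0, f(x)], which is f(x) itself.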
Thus, an estimation strategy is to first estimate the sets $A_\alpha$ from, say, a random sample $X_1, X_2, \ldots, X_n$ drawn from $X$, by a random set $A_{\alpha,n}(X_1, X_2, \ldots, X_n)$, and then use the plug-in estimator
$$f_n(x) = \int_0^{\infty} 1_{A_{\alpha,n}}(x)\, d\alpha \tag{3}$$
to estimate (pointwise) $f(x)$.
Without going into the details of how to construct a random set estimator, we merely note that set estimators like $A_{\alpha,n}(X_1, X_2, \ldots, X_n)$ are random sets.
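To make the plug-in idea concrete, here is a rough sketch in one dimension. It assumes a deliberately simple set estimator that the source does not prescribe: the estimated alpha-level set is the union of histogram bins whose height is at least alpha. The bin grid, sample size, and function names are hypothetical. With this choice the plug-in estimator simply reproduces the histogram height, which makes the mechanics easy to verify.

```python
import numpy as np

def level_set_bins(heights, alpha):
    """Boolean mask over bins: True where the histogram density is >= alpha.

    The union of the selected bins plays the role of the random set A_{alpha,n}.
    This histogram-based choice is an assumption made for illustration only.
    """
    return heights >= alpha

def plug_in_estimate(x, heights, edges, n_alpha=2000):
    """Integrate the indicator of 'x lies in A_{alpha,n}' over alpha."""
    # Index of the bin containing x; points left of the first edge, or on or
    # beyond the last edge, are treated as outside every estimated level set.
    k = int(np.searchsorted(edges, x, side="right")) - 1
    if k < 0 or k >= len(heights):
        return 0.0
    alpha_max = 1.01 * float(heights.max())  # indicator vanishes above the max height
    d_alpha = alpha_max / n_alpha
    alphas = (np.arange(n_alpha) + 0.5) * d_alpha  # midpoint rule
    hits = sum(bool(level_set_bins(heights, a)[k]) for a in alphas)
    return float(hits * d_alpha)

rng = np.random.default_rng(0)
sample = rng.normal(size=500)          # random sample X_1, ..., X_n
edges = np.linspace(-4.0, 4.0, 17)     # 16 bins of width 0.5
heights, _ = np.histogram(sample, bins=edges, density=True)

# The estimate at x = 0.1 should match the height of the bin [0.0, 0.5).
print(plug_in_estimate(0.1, heights, edges), float(heights[8]))
```

Because the level sets depend on the sample, rerunning with a different seed yields different sets, which is the sense in which the set estimator is a random set.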
In a somewhat “hidden” way, random sets appear in robust Bayesian statistics or in incomplete model specifications. The situation is this. The model is a probability measure $P_o$ on $(\Omega, \mathcal{A})$ which is known only to belong to some known set $\mathcal{P}$ of probability measures on $(\Omega, \mathcal{A})$. Without knowing $P_o$, statisticians are forced to work with $F = \inf\{P : P \in \mathcal{P}\}$ or its dual $\sup\{P : P \in \mathcal{P}\}$. While the set function $F$ is clearly not necessarily additive (i.e., not a probability measure), it satisfies a weaker condition, namely,
$$F\Big(\bigcup_{i=1}^{n} A_i\Big) \geq \sum_{\emptyset \neq I \subseteq \{1, 2, \ldots, n\}} (-1)^{|I|+1}\, F\Big(\bigcap_{j \in I} A_j\Big). \tag{4}$$
There are situations (such as when $\Omega$ is finite) where we can “invert” $F$ to obtain
$$f(A) = \sum_{B \subseteq A} (-1)^{|A \setminus B|} F(B), \tag{5}$$
which is nonnegative and satisfies $\sum_{A \subseteq \Omega} f(A) = 1$. As such, the set function $f$ qualifies as a bona fide probability density function, defined not on points of $\Omega$ but on subsets of $\Omega$, i.e., $f$ is the probability density of a random set.
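The inversion can be tried out on a small finite set. The sketch below assumes a three-point Omega and a hypothetical density `f_true` on its subsets; F is built from it as F(A) = sum of f(B) over B contained in A, which is an assumption about how F arises in this toy case, and the inversion formula is then applied to recover f. Exact rational arithmetic is used so that equality can be checked without rounding.

```python
from fractions import Fraction
from itertools import combinations

def subsets(s):
    """All subsets of the frozenset s, returned as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def distribution_from_density(f, omega):
    """F(A) = sum of f(B) over all B contained in A, for every A in omega."""
    return {A: sum(f.get(B, Fraction(0)) for B in subsets(A))
            for A in subsets(omega)}

def mobius_inverse(F, omega):
    """f(A) = sum over B contained in A of (-1)^{|A minus B|} * F(B)."""
    return {A: sum((-1) ** len(A - B) * F[B] for B in subsets(A))
            for A in subsets(omega)}

omega = frozenset({"a", "b", "c"})
# Hypothetical density of a random set: mass on three subsets, summing to 1.
f_true = {
    frozenset({"a"}): Fraction(1, 5),
    frozenset({"b", "c"}): Fraction(1, 2),
    omega: Fraction(3, 10),
}

F = distribution_from_density(f_true, omega)
f_rec = mobius_inverse(F, omega)

for A in subsets(omega):
    print(sorted(A), F[A], f_rec[A])

# Two-set case of the weaker-than-additivity condition, for illustration.
A1, A2 = frozenset({"a", "b"}), frozenset({"b", "c"})
print(F[A1 | A2] >= F[A1] + F[A2] - F[A1 & A2])
```

The recovered values coincide with `f_true`, are nonnegative, and sum to 1, as the text states for this finite setting.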
Another important situation concerns coarse (low-quality) data. Data can be imprecise, e.g., due to imperfections in the data-acquisition procedure (inaccuracy of measurement instruments), and as such they are sets rather than points in the sample space. In such cases, rather than trying to ascribe unique values to the imprecise observations, it is preferable to represent the outcomes of the random experiment or phenomenon as subsets containing the “true” values. For example, missing, censored, or grouped data belong to the category of coarse data.