The interval (2.22) provides us with a geometrical description of the data points about
their sample mean. If the data come from a normal distribution then approximately
91.67% of the data points will lie in the interval (2.22). If we have two samples and
the concentration interval for sample 1 is contained within the concentration interval for
sample 2 then we can say that the first sample is more concentrated about its mean than
the second sample.
Cramer extended the idea leading to (2.21) to the random vector $Y_p$ with specified
expected vector $\mu$ and specified positive definite covariance matrix $\Sigma: p \times p$
by considering the following question: what random vector $Y_p$ has a density that is
uniform over the interior of a $p$-dimensional ellipsoid centred at $\mu$ such
that $E(Y_p) = \mu$ and $\mathrm{cov}(Y_p) = \Sigma$? Making use of the integrals (Cramer, 1946)
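$$\int \cdots \int_{x'\Sigma x < c^2} dx_1 \cdots dx_p = \frac{\pi^{p/2} c^p}{\Gamma\left(\frac{p}{2} + 1\right) \cdot \sqrt{|\Sigma|}}, \qquad (2.23)$$

$$\int \cdots \int_{x'\Sigma x < c^2} x_i x_k \, dx_1 \cdots dx_p = \frac{c^2}{p + 2} \cdot \frac{\Sigma_{ki}}{|\Sigma|} \cdot \frac{\pi^{p/2} c^p}{\Gamma\left(\frac{p}{2} + 1\right) \cdot \sqrt{|\Sigma|}}, \qquad (2.24)$$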
where $\Sigma_{ki}$ denotes the cofactor of $\sigma_{ik} = \sigma_{ki}$ in $\Sigma$ and $|\Sigma|$ the
determinant of $\Sigma$, together with the properties of spherical and elliptical
distributions (see, for example, Fang et al., 1990), it can be shown that the random
vector $Y_p$ that is uniformly distributed over the interior of the ellipsoid defined by

$$(y - \mu)' \Sigma^{-1} (y - \mu) < p + 2 \qquad (2.25)$$

has a mean of $\mu$ and a covariance matrix of $\Sigma$. The ellipsoid
$(y - \mu)' \Sigma^{-1} (y - \mu) = p + 2$ is called a concentration ellipsoid. We observe that,
for $p = 1$, equation (2.25) reduces to the interval (2.22). If we have a random sample of
size $n$ from the distribution of $Y_p$ then we have the sample concentration ellipsoid
defined by the locus of a point $y: p \times 1$ satisfying
$$(y - \bar{x})' S^{-1} (y - \bar{x}) = p + 2, \qquad (2.26)$$
where $\bar{x}$ and $S$ are the usual (unbiased) estimates of $\mu$ and $\Sigma$, respectively.
It is well known that if $Y_p$ has a $p$-variate normal$(\mu; \Sigma)$ distribution then

$$(Y_p - \mu)' \Sigma^{-1} (Y_p - \mu) \sim \chi^2_p. \qquad (2.27)$$
In the case of (2.27) it follows that

$$P\{(Y_p - \mu)' \Sigma^{-1} (Y_p - \mu) < p + 2\} = \begin{cases} 0.9167 & \text{for } p = 1 \\ 0.8647 & \text{for } p = 2 \\ 0.8282 & \text{for } p = 3. \end{cases}$$
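The three probabilities above are values of the $\chi^2_p$ distribution function at $p + 2$, so they can be reproduced numerically. The following is a minimal sketch using only the Python standard library; the function names `chi2_cdf` and `concentration_coverage` are illustrative and not from the text, and the series for the regularized lower incomplete gamma function is one of several ways to evaluate the distribution function.

```python
import math


def chi2_cdf(x: float, df: int) -> float:
    """Chi-square distribution function, evaluated with the power series
    for the regularized lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    s, z = df / 2.0, x / 2.0
    term = 1.0 / s          # n = 0 term of sum z^n / (s (s+1) ... (s+n))
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (s + n)
        total += term
        n += 1
    # Multiply the series by z^s * exp(-z) / Gamma(s).
    return total * math.exp(s * math.log(z) - z - math.lgamma(s))


def concentration_coverage(p: int) -> float:
    """P{chi-square with p degrees of freedom < p + 2}, as in (2.27)."""
    return chi2_cdf(p + 2, p)


for p in (1, 2, 3):
    print(p, round(concentration_coverage(p), 4))
```

For $p = 2$ the probability has the closed form $1 - e^{-2}$, which gives a quick check on the series.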
Therefore, in the case of a two-dimensional configuration of points we expect the sample
concentration ellipse to enclose approximately 86.5% of the data points. Now, in the
case of $p = 2$, write (2.26) as

$$(y - \bar{x})' S^{-1} (y - \bar{x}) = \kappa^2. \qquad (2.28)$$
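As an illustration of (2.28), the sketch below computes the sample mean and the unbiased sample covariance matrix for a two-dimensional sample and counts the fraction of points that fall inside the ellipse for a given $\kappa^2$. Taking $\kappa^2 = 4$ recovers the sample concentration ellipse (2.26) with $p = 2$. The simulated data, the seed and all function names are assumptions made for this example, not part of the text.

```python
import random


def sample_mean_cov(points):
    """Sample mean and unbiased covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def quadratic_form(point, mean, cov):
    """(y - xbar)' S^{-1} (y - xbar), using the explicit 2x2 inverse of S."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def fraction_inside(points, kappa_sq=4.0):
    """Fraction of points strictly inside the ellipse (2.28)."""
    mean, cov = sample_mean_cov(points)
    inside = sum(quadratic_form(pt, mean, cov) < kappa_sq for pt in points)
    return inside / len(points)


# Hypothetical data: a correlated bivariate normal sample.
rng = random.Random(0)
pts = []
for _ in range(20000):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    pts.append((1.0 + 2.0 * z1, -1.0 + 1.2 * z1 + 0.9 * z2))

print(round(fraction_inside(pts), 3))
```

For normal data the printed fraction should be close to the 86.5% quoted above; other values of `kappa_sq` give ellipses that enclose a smaller or larger share of the points.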