The interval (2.22) provides us with a geometrical description of the data points about
their sample mean. If the data come from a normal distribution then approximately
91.67% of the data points will lie in the interval (2.22). If we have two samples and
the concentration interval for sample 1 is contained within the concentration interval for
sample 2 then we can say that the first sample is more concentrated about its mean than
the second sample.
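The comparison of two samples described above can be sketched in a few lines. This is a minimal illustration only: it assumes that the interval (2.22) has the form sample mean ± √3 × sample standard deviation (the form that (2.25) takes when p = 1), and the function names and data are hypothetical.

```python
import math
import statistics


def concentration_interval(sample):
    """Return (lower, upper), ASSUMING the interval is mean +/- sqrt(3) * s."""
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # square root of the (n - 1)-denominator variance
    half_width = math.sqrt(3.0) * s
    return mean - half_width, mean + half_width


def is_contained(inner, outer):
    """True if the interval `inner` lies within the interval `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Hypothetical data: sample_1 is tightly clustered, sample_2 is spread out.
sample_1 = [9.8, 10.1, 10.0, 9.9, 10.2]
sample_2 = [7.0, 13.0, 10.0, 8.5, 11.5]

interval_1 = concentration_interval(sample_1)
interval_2 = concentration_interval(sample_2)
print(interval_1, interval_2)
print(is_contained(interval_1, interval_2))
```

Here the interval for sample_1 lies inside the interval for sample_2, so under the criterion above sample_1 would be described as more concentrated about its mean.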
Cramér extended the idea leading to (2.21) to the random vector $Y_p$ with specified
$\mu$ as expected vector and specified positive definite matrix $\Sigma : p \times p$ as
covariance matrix by considering the following question: what random vector $Y_p$ is
uniformly distributed over the interior of a $p$-dimensional ellipsoid centred at $\mu$
such that $E(Y_p) = \mu$ and $\operatorname{cov}(Y_p) = \Sigma$? Making use of the
integrals (Cramér, 1946)
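$$\int \cdots \int_{x' \Sigma x < c^2} dx_1 \cdots dx_p = \frac{\pi^{p/2} c^p}{\Gamma\left(\frac{p}{2} + 1\right) \cdot \sqrt{|\Sigma|}}, \tag{2.23}$$

$$\int \cdots \int_{x' \Sigma x < c^2} x_i x_k \, dx_1 \cdots dx_p = \frac{c^2}{p + 2} \cdot \frac{\Sigma_{ki}}{|\Sigma|} \cdot \frac{\pi^{p/2} c^p}{\Gamma\left(\frac{p}{2} + 1\right) \cdot \sqrt{|\Sigma|}}, \tag{2.24}$$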
where $\Sigma_{ki}$ denotes the cofactor of $\sigma_{ik} = \sigma_{ki}$ in $\Sigma$ and
$|\Sigma|$ the determinant of $\Sigma$, together with the properties of spherical and
elliptical distributions (see, for example, Fang et al., 1990), it can be shown that the
random vector $Y_p$ that is uniformly distributed over the interior of the ellipsoid
defined by

$$(y - \mu)' \Sigma^{-1} (y - \mu) < p + 2, \tag{2.25}$$

has a mean of $\mu$ and a covariance matrix of $\Sigma$. The ellipsoid
$(y - \mu)' \Sigma^{-1} (y - \mu) = p + 2$ is called a concentration ellipsoid. We observe
that, for $p = 1$, equation (2.25) reduces to the interval (2.22). If we have a random
sample of size $n$ from the distribution of $Y_p$ then we have the sample concentration
ellipsoid defined by the locus of a point $y : p \times 1$ satisfying

$$(y - \bar{x})' S^{-1} (y - \bar{x}) = p + 2, \tag{2.26}$$
where $\bar{x}$ and $S$ are the usual (unbiased) estimates of $\mu$ and $\Sigma$,
respectively. It is well known that if $Y_p$ has a $p$-variate normal$(\mu; \Sigma)$
distribution then

$$(Y_p - \mu)' \Sigma^{-1} (Y_p - \mu) \sim \chi^2_p. \tag{2.27}$$

In the case of (2.27) it follows that

$$P\{(Y_p - \mu)' \Sigma^{-1} (Y_p - \mu) < p + 2\} = \begin{cases} 0.9167 & \text{for } p = 1 \\ 0.8647 & \text{for } p = 2 \\ 0.8282 & \text{for } p = 3. \end{cases}$$
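The three probabilities are values of the chi-squared distribution function with p degrees of freedom evaluated at p + 2. As a rough check, the sketch below evaluates that distribution function using the standard power series for the regularized lower incomplete gamma function. The function name `chi2_cdf` is a hypothetical helper written for this illustration, not part of any library.

```python
import math


def chi2_cdf(x, p):
    """Chi-squared CDF with p degrees of freedom at x > 0 (series expansion)."""
    a = 0.5 * p
    z = 0.5 * x
    # Series: P(a, z) = exp(-z) * z**a / Gamma(a) * sum_n z**n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


for p in (1, 2, 3):
    print(p, round(chi2_cdf(p + 2, p), 4))
```

For p = 1 and p = 2 the results can also be compared with the closed forms erf(√(3/2)) and 1 − exp(−2), which correspond to the values 0.9167 and 0.8647 quoted above.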
Therefore, in the case of a two-dimensional configuration of points we expect the sample
concentration ellipse to enclose approximately 86.5% of the data points. Now, in the
case of $p = 2$, write (2.26) as

$$(y - \bar{x})' S^{-1} (y - \bar{x}) = \kappa^2. \tag{2.28}$$
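The 86.5% figure for p = 2 can be checked by simulation. The following sketch draws a hypothetical bivariate normal sample, computes the sample mean and the unbiased sample covariance matrix, and counts the points that fall inside the sample concentration ellipse with κ² = p + 2 = 4. The seed, sample size, correlation and helper names are arbitrary choices made for this illustration.

```python
import math
import random


def sample_mean_cov(points):
    """Return the mean (mx, my) and unbiased covariance entries (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def quadratic_form(point, mean, cov):
    """Evaluate (y - xbar)' S^{-1} (y - xbar) for a 2 x 2 covariance matrix S."""
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    sxx, sxy, syy = cov
    det = sxx * syy - sxy ** 2
    # Inverse of [[sxx, sxy], [sxy, syy]] is [[syy, -sxy], [-sxy, sxx]] / det.
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


# Hypothetical data: correlated bivariate normal points built from independent
# standard normals z1 and z2 (rho = 0.6 is an arbitrary choice).
rng = random.Random(12345)
rho = 0.6
points = []
for _ in range(20000):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    points.append((z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))

mean, cov = sample_mean_cov(points)
kappa_sq = 2 + 2  # p + 2 with p = 2, as in (2.26)
inside = sum(quadratic_form(pt, mean, cov) < kappa_sq for pt in points)
fraction_inside = inside / len(points)
print(fraction_inside)  # expected to be close to 1 - exp(-2), about 0.8647
```

Because the ellipse is built from the estimates rather than the true parameters, the observed fraction will only approximate the theoretical value, with closer agreement for larger samples.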