Biplot basics - Understanding Biplots

The interval (2.22) provides us with a geometrical description of the data points about their sample mean. If the data come from a normal distribution then approximately 91.67% of the data points will lie in the interval (2.22). If we have two samples and the concentration interval for sample 1 is contained within the concentration interval for sample 2 then we can say that the first sample is more concentrated about its mean than the second sample.

Cramér extended the idea leading to (2.21) to the random vector $\mathbf{Y}_p$ with specified $\boldsymbol{\mu}$ as expected vector and specified positive definite matrix $\boldsymbol{\Sigma} : p \times p$ as covariance matrix by considering the following question: what random vector $\mathbf{Y}_p$ has a density that is uniform over the interior of a $p$-dimensional ellipsoid centred at $\boldsymbol{\mu}$ such that $E(\mathbf{Y}_p) = \boldsymbol{\mu}$ and $\operatorname{cov}(\mathbf{Y}_p) = \boldsymbol{\Sigma}$? Making use of the integrals (Cramér, 1946)

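$$\int \cdots \int_{\mathbf{x}' \boldsymbol{\Sigma} \mathbf{x} < c^2} dx_1 \cdots dx_p = \frac{\pi^{p/2}}{\Gamma\left(\frac{p}{2} + 1\right)} \cdot \frac{c^p}{\sqrt{|\boldsymbol{\Sigma}|}} \tag{2.23}$$

$$\int \cdots \int_{\mathbf{x}' \boldsymbol{\Sigma} \mathbf{x} < c^2} x_i x_k \, dx_1 \cdots dx_p = \frac{\pi^{p/2}}{\Gamma\left(\frac{p}{2} + 1\right)} \cdot \frac{c^p}{\sqrt{|\boldsymbol{\Sigma}|}} \cdot \frac{c^2}{p + 2} \cdot \frac{\Sigma_{ki}}{|\boldsymbol{\Sigma}|} \tag{2.24}$$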

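where $\Sigma_{ki}$ denotes the cofactor of $\sigma_{ik} = \sigma_{ki}$ in $\boldsymbol{\Sigma}$ and $|\boldsymbol{\Sigma}|$ the determinant of $\boldsymbol{\Sigma}$, together with the properties of spherical and elliptical distributions (see, for example, Fang et al., 1990), it can be shown that the random vector $\mathbf{Y}_p$ that is uniformly distributed over the interior of the ellipsoid defined by

$$(\mathbf{y} - \boldsymbol{\mu})' \boldsymbol{\Sigma}^{-1} (\mathbf{y} - \boldsymbol{\mu}) < p + 2, \tag{2.25}$$

has a mean of $\boldsymbol{\mu}$ and a covariance matrix of $\boldsymbol{\Sigma}$. (Dividing (2.24) by (2.23), with $\boldsymbol{\Sigma}^{-1}$ as the matrix of the quadratic form, gives second moments about the centre of $c^2 \sigma_{ik} / (p + 2)$, so the choice $c^2 = p + 2$ yields the covariance matrix $\boldsymbol{\Sigma}$.) The ellipsoid $(\mathbf{y} - \boldsymbol{\mu})' \boldsymbol{\Sigma}^{-1} (\mathbf{y} - \boldsymbol{\mu}) = p + 2$ is called a concentration ellipsoid. We observe that, for $p = 1$, equation (2.25) reduces to the interval (2.22). If we have a random sample of size $n$ from the distribution of $\mathbf{Y}_p$ then we have the sample concentration ellipsoid defined by the locus of a point $\mathbf{y} : p \times 1$ satisfying

$$(\mathbf{y} - \bar{\mathbf{x}})' \mathbf{S}^{-1} (\mathbf{y} - \bar{\mathbf{x}}) = p + 2, \tag{2.26}$$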

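where $\bar{\mathbf{x}}$ and $\mathbf{S}$ are the usual (unbiased) estimates of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$, respectively. It is well known that if $\mathbf{Y}_p$ has a $p$-variate normal $(\boldsymbol{\mu}; \boldsymbol{\Sigma})$ distribution then

$$(\mathbf{Y}_p - \boldsymbol{\mu})' \boldsymbol{\Sigma}^{-1} (\mathbf{Y}_p - \boldsymbol{\mu}) \sim \chi^2_p. \tag{2.27}$$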

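In the case of (2.27) it follows that

$$P\{(\mathbf{Y}_p - \boldsymbol{\mu})' \boldsymbol{\Sigma}^{-1} (\mathbf{Y}_p - \boldsymbol{\mu}) < p + 2\} = \begin{cases} 0.9167 & \text{for } p = 1 \\ 0.8647 & \text{for } p = 2 \\ 0.8282 & \text{for } p = 3. \end{cases}$$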
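
The three probabilities above can be checked numerically. The sketch below is an illustration and not part of the original text. It assumes only the Python standard library and evaluates the chi-square distribution function with p degrees of freedom at p + 2, using the power series of the regularized lower incomplete gamma function. The names `chi2_cdf` and `concentration_coverage` are hypothetical helpers introduced here.

```python
import math


def chi2_cdf(x, df):
    """P(X < x) for X ~ chi-square with df degrees of freedom (x > 0).

    Series used, with a = df / 2 and z = x / 2:
    P(a, z) = z^a e^{-z} / Gamma(a + 1)
              * sum_{n >= 0} z^n / ((a + 1)(a + 2)...(a + n)).
    """
    a = df / 2.0
    z = x / 2.0
    term = 1.0   # n = 0 term of the series
    total = 1.0
    n = 1
    # Terms eventually shrink because z / (a + n) tends to zero.
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    log_prefactor = a * math.log(z) - z - math.lgamma(a + 1.0)
    return math.exp(log_prefactor) * total


def concentration_coverage(p):
    """Probability that a p-variate normal point lies inside ellipsoid (2.25)."""
    return chi2_cdf(p + 2, p)


for p in (1, 2, 3):
    print(p, round(concentration_coverage(p), 4))
```

For p = 2 the probability has the closed form 1 - exp(-2), which gives a quick independent check on the series. The same function can be used for larger p, where the coverage of the concentration ellipsoid continues to decrease.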

Therefore, in the case of a two-dimensional configuration of points we expect the sample concentration ellipse to enclose approximately 86.5% of the data points. Now, in the case of $p = 2$, write (2.26) as

$$(\mathbf{y} - \bar{\mathbf{x}})' \mathbf{S}^{-1} (\mathbf{y} - \bar{\mathbf{x}}) = \kappa \tag{2.28}$$
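
As an illustration of the 86.5% figure, the following sketch simulates a bivariate normal sample and counts the points that fall inside the sample concentration ellipse (2.26) with p = 2, that is, with right-hand side p + 2 = 4. It is a minimal example under stated assumptions and not the authors' code: the sample size, seed, mean and mixing coefficients are arbitrary choices, and `sample_concentration_fraction` is a hypothetical helper written with the standard library only.

```python
import random


def sample_concentration_fraction(points):
    """Fraction of 2-D points with (y - xbar)' S^{-1} (y - xbar) < p + 2 = 4."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Unbiased sample covariance matrix S = [[sxx, sxy], [sxy, syy]].
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    inside = 0
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form using the explicit inverse of the 2 x 2 matrix S.
        q = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        if q < 4.0:
            inside += 1
    return inside / n


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
# Correlated bivariate normal sample built from independent standard normals.
points = []
for _ in range(20000):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    points.append((2.0 + 1.5 * z1, -1.0 + 0.8 * z1 + 0.6 * z2))

fraction = sample_concentration_fraction(points)
print(round(fraction, 3))
```

With a sample of this size the printed fraction should be close to 0.865, although the exact value depends on the seed. For data that are far from normal the enclosed fraction can differ noticeably from this figure.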
