5.2.1.2. Mahalanobis Distance
Mahalanobis distance takes into consideration the different variances of different features and the possible correlation between any two features. The Mahalanobis distance between any two objects x_i and x_j is:

d_M(x_i, x_j) = [(x_i - x_j)^T \Sigma^{-1} (x_i - x_j)]^{1/2}    (5.3)

where \Sigma is the sample variance-covariance matrix.
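As a quick sketch (not part of the text), Eq. 5.3 can be computed directly with NumPy; the function name `mahalanobis` below is our own choice for illustration:

```python
import numpy as np

def mahalanobis(x_i, x_j, sigma):
    """Mahalanobis distance of Eq. 5.3: [(x_i - x_j)^T Sigma^{-1} (x_i - x_j)]^{1/2}."""
    diff = np.asarray(x_i, dtype=float) - np.asarray(x_j, dtype=float)
    return float(np.sqrt(diff @ np.linalg.inv(sigma) @ diff))

# With the identity covariance, the measure reduces to Euclidean distance.
sigma = np.eye(2)
d = mahalanobis([3.0, 4.0], [0.0, 0.0], sigma)  # -> 5.0
```

When a feature has large variance, the corresponding entry of \Sigma^{-1} shrinks, so differences along that feature are down-weighted relative to plain Euclidean distance.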
For example, in Fig. 5.1, d_M(x_{P_1}, x_{C_1}) = [(x_{P_1} - x_{C_1})^T \Sigma_{C_1}^{-1} (x_{P_1} - x_{C_1})]^{1/2} and d_M(x_{P_1}, x_{C_2}) = [(x_{P_1} - x_{C_2})^T \Sigma_{C_2}^{-1} (x_{P_1} - x_{C_2})]^{1/2}, where matrix \Sigma_{C_j} is the sample variance-covariance matrix of the objects assigned to the cluster centered at x_{C_j}, j = 1, 2. Since d_M(x_{P_1}, x_{C_1}) < d_M(x_{P_1}, x_{C_2}), point P_1 is correctly assigned to the cluster centered at C_1. Similarly, point P_2 is correctly assigned to the cluster centered at C_2.
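The assignment rule just described can be sketched as follows; the centers and covariances below are hypothetical stand-ins for C_1 and C_2 of Fig. 5.1 (chosen so that the cluster at C_1 is elongated along the first axis), not values from the text:

```python
import numpy as np

def assign_cluster(x, centers, covariances):
    """Assign x to the center with the minimum Mahalanobis distance (Eq. 5.3)."""
    dists = []
    for c, sigma in zip(centers, covariances):
        diff = np.asarray(x, dtype=float) - np.asarray(c, dtype=float)
        dists.append(float(np.sqrt(diff @ np.linalg.inv(sigma) @ diff)))
    return int(np.argmin(dists)), dists

centers = [np.array([0.0, 0.0]),              # stand-in for x_C1
           np.array([3.0, 0.0])]              # stand-in for x_C2
covs = [np.array([[9.0, 0.0], [0.0, 1.0]]),   # Sigma_C1: wide spread on axis 1
        np.array([[0.25, 0.0], [0.0, 0.25]])] # Sigma_C2: tight, spherical

# The point (2, 0) is Euclidean-closer to the second center, yet
# Mahalanobis-closer to the first because of C1's large variance.
label, dists = assign_cluster([2.0, 0.0], centers, covs)  # label -> 0
```

This illustrates why the covariance-aware distance can reverse the decision a plain Euclidean rule would make.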
Mahalanobis distance is closely analogous to Hotelling's T^2 statistic, a widely used multivariate statistic that measures the weighted distance from a high-dimensional point to a population center [23]. We briefly introduce Hotelling's T^2 statistic here to help explain the Mahalanobis distance.
Hotelling's T^2 statistic is calculated as:

T^2 = [(x_i - \bar{x})^T S^{-1} (x_i - \bar{x})]    (5.4)
where \bar{x} is the sample mean and S is the sample variance-covariance matrix. If point x_i has a high T^2 statistic, then x_i is unlikely to have been generated from an underlying population whose probability density function (pdf) has sample mean \bar{x} and sample variance-covariance matrix S. Readers can see the analogy between Eqs. 5.3 and 5.4: assigning a point to the cluster whose center has the minimum Mahalanobis distance to the point is equivalent to assigning the point to the population for which it has the minimum Hotelling's T^2 statistic, i.e., the highest probability that the point was generated by that population.
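The equivalence between Eqs. 5.3 and 5.4 is easy to check numerically: T^2 is exactly the squared Mahalanobis distance from x_i to the sample mean. A small sketch, with randomly generated data standing in for a sample:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 3))      # an arbitrary sample, for illustration only
x_bar = data.mean(axis=0)             # sample mean
S = np.cov(data, rowvar=False)        # sample variance-covariance matrix

x_i = np.array([1.0, -0.5, 0.3])
diff = x_i - x_bar
T2 = float(diff @ np.linalg.inv(S) @ diff)             # Eq. 5.4
d_M = float(np.sqrt(diff @ np.linalg.inv(S) @ diff))   # Eq. 5.3 with x_j = x_bar
# T2 == d_M ** 2, so minimizing one minimizes the other.
```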
Users can also find the similarity between Mahalanobis distance and the likelihood value of an observation under the assumption that the observation is a sample from a multivariate normal distribution. For instance, in Fig. 5.1, the likelihood value of an observation under a multivariate normal distribution with sample mean x_{C_j} and sample variance-covariance matrix \Sigma_{C_j} is

L_j(x) = \frac{1}{(2\pi)^{p/2} |\Sigma_{C_j}|^{1/2}} \exp(-\frac{1}{2} (x - x_{C_j})^T \Sigma_{C_j}^{-1} (x - x_{C_j})), j = 1, 2.

Taking Eq. 5.3 into consideration, we get

L_j(x) = \frac{1}{(2\pi)^{p/2} |\Sigma_{C_j}|^{1/2}} \exp(-\frac{1}{2} d_M^2(x, x_{C_j})),    (5.5)

The term |\Sigma_{C_j}| on the right-hand side of Eq. 5.5 is the determinant of matrix \Sigma_{C_j}; it is also called the generalized variance [17]. From Eqs. 5.3 and 5.5, we can see
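As a numerical check (our own sketch, with an arbitrary mean and covariance), the multivariate normal density written with the full quadratic form and the same density written via the Mahalanobis distance of Eq. 5.5 agree:

```python
import numpy as np

def mvn_pdf(x, mean, sigma):
    """Multivariate normal density via the quadratic form in the exponent."""
    p = len(mean)
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    quad = diff @ np.linalg.inv(sigma) @ diff
    norm = (2 * np.pi) ** (p / 2) * np.sqrt(np.linalg.det(sigma))
    return float(np.exp(-0.5 * quad) / norm)

sigma = np.array([[2.0, 0.5], [0.5, 1.0]])   # illustrative Sigma_Cj
mean = np.array([1.0, -1.0])                 # illustrative x_Cj
x = np.array([0.3, 0.4])

d_M = float(np.sqrt((x - mean) @ np.linalg.inv(sigma) @ (x - mean)))
lhs = mvn_pdf(x, mean, sigma)
# Eq. 5.5: the same density expressed through the Mahalanobis distance.
rhs = float(np.exp(-0.5 * d_M ** 2)
            / ((2 * np.pi) ** (len(mean) / 2) * np.sqrt(np.linalg.det(sigma))))
```

So maximizing the likelihood L_j(x) over j is the same as minimizing d_M(x, x_{C_j}), up to the |\Sigma_{C_j}| normalization term.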