Biomedical Engineering Reference
In-Depth Information
measured in high-dimensional space [ 14 ]. The new geometrical configuration of
points, which preserves the proximities of the high-dimensional space, facilitates the
perception of the data's underlying structure and often makes it much easier to understand. The problem addressed by MDS can be stated as follows: given $n_i$ items in an
$m$-dimensional space and an $n_i \times n_i$ matrix of proximity measures among the items,
MDS produces a $p_i$-dimensional configuration $X$, $p_i \le m$, representing the items
such that the distances among the points in the new space reflect, with some degree
of fidelity, the proximities in the data. The proximity measures the (dis)similarities
among the items, and in general, it is a distance measure: the more similar two items
are, the smaller their distance is. The Minkowski distance metric provides a general
way to specify distance for quantitative data in a multi-dimensional space:
$$d_{ij} = \left( \sum_{k=1}^{m} w_k \, |x_{ik} - x_{jk}|^{r} \right)^{1/r} \qquad (8.13)$$
where $m$ is the number of dimensions, $x_{ik}$ is the value of dimension $k$ for object $i$,
and $w_k$ is a weight. For $w_k = 1$, $r = 2$ gives the Euclidean distance metric, while
$r = 1$ leads to the city-block (or Manhattan) metric. In practice,
the Euclidean distance metric is generally used, but there are several other defini-
tions that can be applied, including for binary data [ 22 ]. Typically MDS is used to
transform the data into two or three dimensions so that the result can be visualized to uncover
the data's hidden structure, but any $p_i < m$ is also possible. The geometrical
representation obtained with MDS is indeterminate with respect to translation, rotation,
and reflection [ 47 ]. There are two forms of MDS, namely the metric MDS and the
nonmetric MDS. The metric MDS uses the actual values of dissimilarities, while
nonmetric MDS effectively uses only their ranks. Metric MDS assumes that the
dissimilarities $\delta_{ij}$ calculated in the original $m$-dimensional data and the distances $d_{ij}$ in the
$p_i$-dimensional space are related as follows:
$$d_{ij} \approx f(\delta_{ij}) \qquad (8.14)$$
where $f$ is a continuous monotonic function. Metric (scaling) refers to the type of
transformation $f$ of the dissimilarities, and its form determines the MDS model. If
$d_{ij} = \delta_{ij}$ (that is, $f = 1$, the identity transformation) and a Euclidean distance metric is used, we obtain the
classical (metric) MDS. In metric MDS the dissimilarities between all objects are
known numbers and they are approximated by distances. Thus objects are mapped
into a low-dimensional space, and distances are calculated and compared with the
dissimilarities. Then objects are moved in such a way that the fit improves, until
an objective function (called the stress function in the context of MDS) is minimized.
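
The quantities discussed so far can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: it relies on NumPy, the function names are hypothetical, the embedding uses the closed-form double-centering solution commonly associated with classical MDS rather than the iterative move-and-compare procedure described above, and the stress shown is a raw sum of squared differences, which is only one of several definitions in use.

```python
import numpy as np


def minkowski_distances(X, r=2.0, w=None):
    """Pairwise weighted Minkowski distances (Eq. 8.13) between the rows of X."""
    X = np.asarray(X, dtype=float)
    if w is None:
        w = np.ones(X.shape[1])                    # w_k = 1 for every dimension
    diff = np.abs(X[:, None, :] - X[None, :, :])   # |x_ik - x_jk|, shape (n, n, m)
    return (w * diff ** r).sum(axis=2) ** (1.0 / r)


def classical_mds(D, p=2):
    """Closed-form embedding of a Euclidean distance matrix D into p dimensions.

    Assumption: this is the double-centering (Torgerson) solution, used here
    as a stand-in for the iterative procedure described in the text.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n            # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                    # inner-product (Gram) matrix
    eigvals, eigvecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:p]          # indices of the p largest
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale               # configuration, shape (n, p)


def raw_stress(D, Y):
    """Sum over pairs i < j of (delta_ij - d_ij)^2, with f taken as the identity."""
    d = minkowski_distances(Y, r=2.0)
    upper = np.triu_indices_from(D, k=1)
    return float(((D[upper] - d[upper]) ** 2).sum())


if __name__ == "__main__":
    # Four corners of a unit square lying in a plane of 3-D space (m = 3).
    X = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
    D = minkowski_distances(X, r=2.0)              # Euclidean dissimilarities
    Y = classical_mds(D, p=2)                      # 2-D configuration
    print("configuration:\n", Y)
    print("raw stress:", raw_stress(D, Y))
```

Because the four points already lie in a plane, the two-dimensional configuration reproduces the original distances up to rounding error, although its orientation is arbitrary, in line with the indeterminacy under translation, rotation, and reflection noted earlier. Calling `minkowski_distances(X, r=1.0)` gives the city-block distances instead.
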
In nonmetric MDS, the metric properties of $f$ are relaxed, but the rank order of
the dissimilarities must be preserved. The transformation function $f$ must obey the
monotonicity constraint $\delta_{ij} < \delta_{rs} \Rightarrow f(\delta_{ij}) \le f(\delta_{rs})$ for all objects. The advantage
of nonmetric MDS is that no assumptions need to be made about the underlying
transformation function $f$. Therefore, it can be used in situations where only the rank
order of dissimilarities is known (ordinal data). Additionally, it can be used in cases
where there is incomplete information. In such cases, the configuration X is con-
structed from a subset of the distances, and, at the same time, the other (missing)