Table 9.1 Example data set for both continuous and categorical data.

Subject   Height   Eye colour
a         172      blue
b         171      brown
c         175      green
d         168      brown
Figure 9.1 Graphical representation of height and eye colour data of Table 9.1. [Figure not reproduced: subjects d, b, a and c placed in that order along a height axis marked from 165 to 179.]
9.2 Calculating inter-sample distances
In the nonlinear biplot described in Chapter 5 the matrix $\mathbf{X}: n \times p$ represents $n$ observations on $p$ variables and $d_{ij}$ denotes the distance between observations $\mathbf{x}_i$ and $\mathbf{x}_j$. The matrix $\mathbf{D} = \{-\tfrac{1}{2} d_{ij}^2\}$ of ddistances forms the basis of the nonlinear biplot.
As in Chapter 5, it will be assumed that the matrix $\mathbf{D}$ is calculated with additive Euclidean embeddable distance measures. The ddistances can therefore be expressed in the form

$$-\tfrac{1}{2} d_{ij}^2 = \sum_{k=1}^{p} f_k(x_{ik}, x_{jk}). \tag{9.1}$$
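The additive form (9.1) can be illustrated with a minimal Python sketch that builds the ddistance matrix by summing a per-variable term over the columns of the data matrix. The function names `ddistance_matrix` and `half_neg_squared_diff` are hypothetical, and the per-variable term $-\tfrac{1}{2}(x_{ik} - x_{jk})^2$ is assumed here as one admissible choice for a continuous variable. Only the height column of Table 9.1 is used, since eye colour is categorical and needs a different term.

```python
import numpy as np

def ddistance_matrix(X, f):
    """Sum the per-variable term f(x_ik, x_jk) over the p columns of X.

    The result is the n x n matrix whose (i, j) entry is -1/2 d_ij^2,
    assuming the distance measure is additive over variables.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    D = np.zeros((n, n))
    for k in range(p):
        col = X[:, k]
        # Broadcasting evaluates f for every pair (x_ik, x_jk) at once.
        D += f(col[:, None], col[None, :])
    return D

def half_neg_squared_diff(x_ik, x_jk):
    # Assumed per-variable term for a continuous variable:
    # -1/2 (x_ik - x_jk)^2
    return -0.5 * (x_ik - x_jk) ** 2

# Heights of subjects a, b, c, d from Table 9.1 (one continuous variable).
heights = np.array([[172.0], [171.0], [175.0], [168.0]])
D = ddistance_matrix(heights, half_neg_squared_diff)
print(D)
```

With a single variable the entry for subjects a and b is $-\tfrac{1}{2}(172 - 171)^2 = -0.5$. The matrix is symmetric with a zero diagonal, as expected when each term depends only on the difference between two values.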
For continuous variables, any additive Euclidean embeddable distance measure can be used. For example, in Chapter 5 examples of the function $f_k(x_{ik}, x_{jk})$ for the $k$th variable are defined as follows: for Pythagorean distance,

$$f_k(x_{ik}, x_{jk}) = -\tfrac{1}{2}(x_{ik} - x_{jk})^2; \tag{9.2}$$