Databases Reference
In-Depth Information
Symmetry:
d
.
i
,
j
/D
d
.
j
,
i
/
: Distance is a symmetric function.
Triangle inequality:
d
: Going directly from object
i
to object
j
in space is no more than making a detour over any other object
k
.
.
i
,
j
/
d
.
i
,
k
/C
d
.
k
,
j
/
A measure that satisfies these conditions is known as
metric
. Please note that the
non-negativity property is implied by the other three properties.
Example2.19
Euclidean distance and Manhattan distance.
Let
x
1
D (1, 2) and
x
2
D (3, 5) repre-
se
nt two
objects as shown in Figure 2.23. The Euclidean distance between the two is
p
2
2
C3
2
D 3.61. The Manhattan distance between the two is 2C3 D 5.
Minkowski distance
is a generalization of the Euclidean and Manhattan distances.
It is defined as
/D
h
q
j
x
i
1
x
j
1
j
h
Cj
x
i
2
x
j
2
j
h
CCj
x
ip
x
jp
j
h
,
d
.
i
,
j
(2.18)
where
h
is a real number such that
h
1. (Such a distance is also called
L
p
norm
in
some literature, where the symbol
p
refers to our notation of
h
. We have kept
p
as
the number of attributes to be consistent with the rest of this chapter.) It represents
the Manhattan distance when
h
D 1 (i.e.,
L
1
norm) and Euclidean distance when
h
D 2
(i.e.,
L
2
norm).
The
supremum distance
(also referred to as
L
max
,
L
1
norm
and as the
Chebyshev
distance
) is a generalization of the Minkowski distance for
h
!1. To compute it, we
find the attribute
f
that gives the maximum difference in values between the two objects.
This difference is the supremum distance, defined more formally as:
0
1
1
h
p
X
p
max
f
j
x
if
x
jf
j.
@
j
x
if
x
jf
j
h
A
d
.
i
,
j
/D
lim
h
!1
D
(2.19)
f
D1
The
L
1
norm is also known as the
uniform norm
.
x
2
= (3, 5)
5
4
Euclidean distance
= (2
2
+ 3
2
)
1/2
= 3.61
3
3
Manhattan distance
= 2 + 3 = 5
Supremum distance
= 5 - 2 = 3
x
1
= (1, 2)
2
1
2
1
2
3
Figure2.23
Euclidean, Manhattan, and supremum distances between two objects.