Databases Reference
In-Depth Information
Symmetry: d
.
i , j
/D d
.
j , i
/
: Distance is a symmetric function.
Triangle inequality: d
: Going directly from object i to object j
in space is no more than making a detour over any other object k .
.
i , j
/ d
.
i , k
/C d
.
k , j
/
A measure that satisfies these conditions is known as metric . Please note that the
non-negativity property is implied by the other three properties.
Example2.19 Euclidean distance and Manhattan distance. Let x 1 D (1, 2) and x 2 D (3, 5) repre-
se nt two objects as shown in Figure 2.23. The Euclidean distance between the two is
p 2 2 C3 2 D 3.61. The Manhattan distance between the two is 2C3 D 5.
Minkowski distance is a generalization of the Euclidean and Manhattan distances.
It is defined as
/D h q
j x i 1 x j 1 j h Cj x i 2 x j 2 j h CCj x ip x jp j h ,
d
.
i , j
(2.18)
where h is a real number such that h 1. (Such a distance is also called L p norm in
some literature, where the symbol p refers to our notation of h . We have kept p as
the number of attributes to be consistent with the rest of this chapter.) It represents
the Manhattan distance when h D 1 (i.e., L 1 norm) and Euclidean distance when h D 2
(i.e., L 2 norm).
The supremum distance (also referred to as L max , L 1 norm and as the Chebyshev
distance ) is a generalization of the Minkowski distance for h !1. To compute it, we
find the attribute f that gives the maximum difference in values between the two objects.
This difference is the supremum distance, defined more formally as:
0
1
1
h
p X
p
max f j x if x jf j.
@
j x if x jf j h
A
d
.
i , j
/D lim
h !1
D
(2.19)
f D1
The L 1 norm is also known as the uniform norm .
x 2 = (3, 5)
5
4
Euclidean distance
= (2 2 + 3 2 ) 1/2 = 3.61
3
3
Manhattan distance
= 2 + 3 = 5
Supremum distance
= 5 - 2 = 3
x 1 = (1, 2)
2
1
2
1
2
3
Figure2.23 Euclidean, Manhattan, and supremum distances between two objects.
 
Search WWH ::




Custom Search