Figure 12.9 A property of LOF(o). (Diagram labels: cluster C, object o, k = 3, direct_min, direct_max, indirect_min, indirect_max.)
and

$$\mathit{indirect}_{\max}(o) = \max\{\mathit{reachdist}_k(o'' \leftarrow o') \mid o' \in N_k(o) \text{ and } o'' \in N_k(o')\}. \qquad (12.18)$$
Then, it can be shown that LOF(o) is bounded as

$$\frac{\mathit{direct}_{\min}(o)}{\mathit{indirect}_{\max}(o)} \le \mathit{LOF}(o) \le \frac{\mathit{direct}_{\max}(o)}{\mathit{indirect}_{\min}(o)}. \qquad (12.19)$$
This result clearly shows that LOF captures the relative density of an object.
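The bound on LOF(o) can be checked numerically on a toy data set. The following is a minimal sketch, not a reference implementation. It assumes the usual definitions from the LOF method: dist_k(o) is the distance from o to its k-th nearest neighbor, N_k(o) contains every other object within dist_k(o) of o (ties included), reachdist_k(a <- b) = max{dist_k(a), dist(a, b)}, and the local reachability density of o is the inverse of the mean reachability distance from o to its neighbors. The function names, the sample points, and the choice k = 3 are illustrative assumptions, and the points are assumed to be distinct so that no density is infinite.

```python
import math

def k_neighborhood(points, i, k):
    """Return (dist_k, N_k) for points[i]; N_k keeps ties at distance dist_k."""
    ranked = sorted(
        (math.dist(points[i], points[j]), j)
        for j in range(len(points)) if j != i
    )
    dist_k = ranked[k - 1][0]
    return dist_k, [j for d, j in ranked if d <= dist_k]

def lof_with_bounds(points, k):
    """Return a list of (LOF, lower bound, upper bound), one tuple per point."""
    n = len(points)
    dist_k, nbrs = [0.0] * n, [None] * n
    for i in range(n):
        dist_k[i], nbrs[i] = k_neighborhood(points, i, k)

    def reach(target, source):
        # reachdist_k(target <- source) = max{dist_k(target), dist(source, target)}
        return max(dist_k[target], math.dist(points[source], points[target]))

    # Local reachability density: inverse of the mean reachability distance.
    lrd = [len(nbrs[i]) / sum(reach(j, i) for j in nbrs[i]) for i in range(n)]

    results = []
    for i in range(n):
        lof = sum(lrd[j] for j in nbrs[i]) / (len(nbrs[i]) * lrd[i])
        direct = [reach(j, i) for j in nbrs[i]]
        indirect = [reach(m, j) for j in nbrs[i] for m in nbrs[j]]
        lower = min(direct) / max(indirect)   # direct_min / indirect_max
        upper = max(direct) / min(indirect)   # direct_max / indirect_min
        results.append((lof, lower, upper))
    return results

# Illustrative data: a tight group of five points plus one remote point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5)]
results = lof_with_bounds(points, k=3)
for p, (lof, lower, upper) in zip(points, results):
    print(p, f"{lower:.3f} <= {lof:.3f} <= {upper:.3f}")
```

On this data the remote point (5, 5) receives the largest LOF value, while the points inside the group score close to 1, which is the relative-density behavior the bound describes.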
12.5 Clustering-Based Approaches
The notion of outliers is highly related to that of clusters. Clustering-based approaches
detect outliers by examining the relationship between objects and clusters. Intuitively,
an outlier is an object that belongs to a small and remote cluster, or does not belong to
any cluster.
This leads to three general approaches to clustering-based outlier detection. Consider
an object.
Does the object belong to any cluster? If not, then it is identified as an outlier.
Is there a large distance between the object and the cluster to which it is closest? If
yes, it is an outlier.
Is the object part of a small or sparse cluster? If yes, then all the objects in that cluster
are outliers.
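The three questions above can be sketched as a single labeling pass over already-clustered data. This is a hedged illustration only: it assumes cluster labels come from some earlier clustering step (with None marking an object assigned to no cluster), it measures the distance to a cluster by the distance to its centroid, and it tests only cluster size, not sparsity. The function name and the thresholds max_centroid_dist and min_cluster_size are hypothetical parameters chosen for the example.

```python
import math
from collections import defaultdict

def clustering_based_outliers(points, labels, max_centroid_dist, min_cluster_size):
    """Map the index of each flagged object to the reason it was flagged."""
    members = defaultdict(list)
    for i, label in enumerate(labels):
        if label is not None:
            members[label].append(i)

    # Centroid of each cluster: coordinate-wise mean of its members.
    dims = len(points[0])
    centroids = {
        label: tuple(sum(points[i][d] for i in idx) / len(idx) for d in range(dims))
        for label, idx in members.items()
    }

    flagged = {}
    for i, (p, label) in enumerate(zip(points, labels)):
        if label is None:
            # Question 1: the object belongs to no cluster.
            flagged[i] = "no cluster"
        elif len(members[label]) < min_cluster_size:
            # Question 3: every object in a small cluster is an outlier.
            flagged[i] = "small cluster"
        elif min(math.dist(p, c) for c in centroids.values()) > max_centroid_dist:
            # Question 2: the object is far from its closest cluster.
            flagged[i] = "far from closest cluster"
    return flagged

# Illustrative data: one main cluster with a distant member, one tiny
# cluster, and one object left unclustered.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (4, 4), (10, 10), (10, 11), (20, 0)]
labels = [0, 0, 0, 0, 0, 1, 1, None]
flagged = clustering_based_outliers(points, labels,
                                    max_centroid_dist=3.0, min_cluster_size=3)
print(flagged)
```

In practice the two thresholds depend on the data and on the clustering method used, so they would need to be tuned rather than fixed as in this example.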