Figure 12.9 A property of LOF(o). (Illustration with k = 3, marking direct_min, direct_max, indirect_min, and indirect_max for an object o and a cluster C.)
and
\[
\mathit{indirect}_{\max}(o) = \max\{\mathit{reachdist}_k(o'' \leftarrow o') \mid o' \in N_k(o) \text{ and } o'' \in N_k(o')\}. \tag{12.18}
\]
Then, it can be shown that LOF(o) is bounded as
\[
\frac{\mathit{direct}_{\min}(o)}{\mathit{indirect}_{\max}(o)} \le \mathit{LOF}(o) \le \frac{\mathit{direct}_{\max}(o)}{\mathit{indirect}_{\min}(o)}. \tag{12.19}
\]
This result clearly shows that LOF captures the relative density of an object.
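The bound in Eq. (12.19) can be checked numerically on a small data set. The following is a minimal sketch, not a reference implementation. It assumes the usual definition reachdist_k(o' ← o) = max{dist_k(o'), dist(o, o')}, which is not restated in this section, Euclidean distance, and no group of more than k identical points. The function names and the sample points are illustrative only.

```python
import math

def knn_info(points, k):
    """Return the k-distance and the neighbor index list N_k for every point."""
    n = len(points)
    dist_k, nbrs = [], []
    for i in range(n):
        d = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        kd = d[k - 1][0]                      # distance to the k-th nearest neighbor
        dist_k.append(kd)
        nbrs.append([j for dij, j in d if dij <= kd])  # ties are included in N_k
    return dist_k, nbrs

def lof_with_bounds(points, k):
    """Return a list of (LOF, lower bound, upper bound) tuples, one per point."""
    n = len(points)
    dist_k, nbrs = knn_info(points, k)

    def reach(target, source):
        # reachdist_k(target <- source), under the assumed definition
        return max(dist_k[target], math.dist(points[target], points[source]))

    # Local reachability density: |N_k(o)| / sum of reachdist_k(o' <- o)
    lrd = [len(nbrs[i]) / sum(reach(j, i) for j in nbrs[i]) for i in range(n)]

    out = []
    for i in range(n):
        lof = sum(lrd[j] for j in nbrs[i]) / (len(nbrs[i]) * lrd[i])
        direct = [reach(j, i) for j in nbrs[i]]
        indirect = [reach(m, j) for j in nbrs[i] for m in nbrs[j]]
        lower = min(direct) / max(indirect)   # direct_min / indirect_max
        upper = max(direct) / min(indirect)   # direct_max / indirect_min
        out.append((lof, lower, upper))
    return out

# Illustrative data: a tight group of five points plus one remote point.
sample_points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (5, 5)]

for p, (lof, lower, upper) in zip(sample_points, lof_with_bounds(sample_points, 3)):
    print(p, round(lower, 3), round(lof, 3), round(upper, 3))
```

For each point the printed LOF value should lie between the two bounds, and the remote point should receive the largest LOF, which is what "relative density" means here: its neighbors are much denser than it is.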
12.5 Clustering-Based Approaches
The notion of outliers is highly related to that of clusters. Clustering-based approaches
detect outliers by examining the relationship between objects and clusters. Intuitively,
an outlier is an object that belongs to a small and remote cluster, or does not belong to
any cluster.
This leads to three general approaches to clustering-based outlier detection. Consider
an object.
Does the object belong to any cluster? If not, then it is identified as an outlier.
Is there a large distance between the object and the cluster to which it is closest? If
yes, it is an outlier.
Is the object part of a small or sparse cluster? If yes, then all the objects in that cluster
are outliers.
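The three questions above can be sketched as simple checks on the output of any clustering algorithm. This is only an illustration under stated assumptions: the label -1 for "belongs to no cluster", the use of the centroid to measure the distance to the closest cluster, and the thresholds `max_center_dist` and `min_cluster_size` are all hypothetical choices, not part of the text. The sketch tests cluster size only and does not measure sparsity.

```python
import math
from collections import defaultdict

NOISE = -1  # assumed label for an object that belongs to no cluster

def clustering_outliers(points, labels, max_center_dist, min_cluster_size):
    """Map the index of each flagged object to the reason it was flagged."""
    members = defaultdict(list)
    for idx, lab in enumerate(labels):
        if lab != NOISE:
            members[lab].append(idx)

    dims = len(points[0])
    # Centroid of each cluster (an assumed stand-in for "the cluster").
    centers = {
        lab: tuple(sum(points[i][d] for i in idxs) / len(idxs) for d in range(dims))
        for lab, idxs in members.items()
    }

    reasons = {}
    for idx, lab in enumerate(labels):
        if lab == NOISE:
            reasons[idx] = "no cluster"                # first question
        elif len(members[lab]) < min_cluster_size:
            reasons[idx] = "small cluster"             # third question
        elif min(math.dist(points[idx], c) for c in centers.values()) > max_center_dist:
            reasons[idx] = "far from closest cluster"  # second question
    return reasons

# Illustrative data and labels; in practice the labels come from a clustering step.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (4, 4), (10, 10), (10, 11), (20, 0)]
labels = [0, 0, 0, 0, 0, 1, 1, NOISE]
print(clustering_outliers(points, labels, max_center_dist=2.5, min_cluster_size=3))
```

Checking cluster size before distance means that every member of a small cluster is reported as an outlier together, matching the third question, rather than being judged individually by distance.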