Let's look at examples of each of these approaches.
Example 12.15 Detecting outliers as objects that do not belong to any cluster. Gregarious animals
(e.g., goats and deer) live and move in flocks. Using outlier detection, we can identify
animals that are not part of a flock as outliers. Such animals may be either lost or
wounded.
In Figure 12.10, each point represents an animal living in a group. Using a density-based
clustering method, such as DBSCAN, we note that the black points belong to
clusters. The white point, a, does not belong to any cluster and is thus declared an
outlier.
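The idea in Example 12.15 can be sketched in a few lines of Python. This is a minimal, simplified DBSCAN written for illustration only, not a reference implementation; the coordinates, the eps radius, and the min_pts threshold are made-up assumptions and do not come from Figure 12.10.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of the current cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nbrs)
        cluster += 1
    return labels

# Hypothetical animal positions: two flocks and one stray animal "a".
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 20)]
labels = dbscan(points, eps=1.5, min_pts=3)

# Objects that belong to no cluster are reported as outliers.
outliers = [p for p, label in zip(points, labels) if label == -1]
print(outliers)
```

With these assumed parameters, the two flocks form two clusters and the stray animal is left without a cluster. In practice the result depends strongly on the choice of eps and min_pts.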
The second approach to clustering-based outlier detection considers the distance
between an object and the cluster to which it is closest. If the distance is large, then
the object is likely an outlier with respect to the cluster. Thus, this approach detects
individual outliers with respect to clusters.
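A minimal sketch of this second approach follows, assuming k-means as the clustering method and using the raw distance to the nearest cluster center as the outlier score. The data points, the initial centers, and the fixed iteration count are hypothetical choices made for illustration.

```python
import math

def kmeans(points, centers, iterations=20):
    """Plain Lloyd iterations starting from the given initial centers."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            # Assign each point to its nearest current center.
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def outlier_scores(points, centers):
    """Score each point by its distance to the closest cluster center."""
    return [min(math.dist(p, c) for c in centers) for p in points]

# Hypothetical data: three compact groups plus one distant point (index 12).
points = [(0, 0), (0, 2), (2, 0), (2, 2),
          (10, 0), (10, 2), (12, 0), (12, 2),
          (5, 10), (5, 12), (7, 10), (7, 12),
          (6, 20)]
centers = kmeans(points, centers=[(0, 0), (10, 0), (5, 10)])
scores = outlier_scores(points, centers)

# The object with the largest score is the strongest outlier candidate.
most_outlying = max(range(len(points)), key=scores.__getitem__)
print(points[most_outlying], round(scores[most_outlying], 2))
```

Note that the distant point still pulls its cluster's center toward itself, so raw distances are only a rough score. Ranking the scores, rather than applying a fixed threshold, avoids having to assume a cutoff value.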
Example 12.16 Clustering-based outlier detection using distance to the closest cluster. Using the
k-means clustering method, we can partition the data points shown in Figure 12.11 into
three clusters, as shown using different symbols. The center of each cluster is marked
with a C.
For each object, o, we can assign an outlier score to the object according to the distance
between the object and the center that is closest to the object. Suppose the closest
center to o is c_o; then the distance between o and c_o is dist(o, c_o), and the average
Figure 12.10 Object a is an outlier because it does not belong to any cluster.
Figure 12.11 Outliers (a, b, c) are far from the clusters to which they are closest (with respect to the cluster
centers).