10.4 Density-Based Methods
Partitioning and hierarchical methods are designed to find spherical-shaped clusters.
They have difficulty finding clusters of arbitrary shape such as the “S” shape and oval
clusters in Figure 10.13. Given such data, they would likely inaccurately identify convex
regions, where noise or outliers are included in the clusters.
Alternatively, to find clusters of arbitrary shape, we can model clusters as dense
regions in the data space, separated by sparse regions. This is the main strategy behind
density-based clustering methods, which can discover clusters of nonspherical shape.
In this section, you will learn the basic techniques of density-based clustering by
studying three representative methods, namely, DBSCAN (Section 10.4.1), OPTICS
(Section 10.4.2), and DENCLUE (Section 10.4.3).
10.4.1 DBSCAN: Density-Based Clustering Based on Connected Regions with High Density
“How can we find dense regions in density-based clustering?” The density of an object o
can be measured by the number of objects close to o. DBSCAN (Density-Based Spatial
Clustering of Applications with Noise) finds core objects, that is, objects that have dense
neighborhoods. It connects core objects and their neighborhoods to form dense regions
as clusters.
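The idea of measuring the density of an object by the number of objects close to it can be sketched in a few lines of Python. This is a minimal illustration, not DBSCAN itself: the function name `neighbor_counts` and the parameter `radius` are hypothetical, "close" is assumed to mean within a fixed Euclidean distance, and each object is assumed to count as its own neighbor.

```python
import math


def neighbor_counts(points, radius):
    """Return, for each point, how many points lie within `radius` of it.

    Assumptions of this sketch: distance is Euclidean, and the count
    includes the point itself.
    """
    counts = []
    for p in points:
        # Brute-force scan over all points; fine for small illustrative data.
        n = sum(1 for q in points if math.dist(p, q) <= radius)
        counts.append(n)
    return counts


# Three points close together and one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
print(neighbor_counts(points, radius=1.5))  # [3, 3, 3, 1]
```

The three nearby points each see a count of 3, while the isolated point sees only itself. Objects with high counts are the candidates for core objects; the isolated point is the kind of object a density-based method would treat as noise.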
“How does DBSCAN quantify the neighborhood of an object?” A user-specified parameter
ε > 0 is used to specify the radius of a neighborhood we consider for every object.
The ε-neighborhood of an object o is the space within a radius ε centered at o.
Due to the fixed neighborhood size parameterized by ε, the density of a neighborhood
can be measured simply by the number of objects in the neighborhood. To determine
whether a neighborhood is dense or not, DBSCAN uses another user-specified
Figure 10.13 Clusters of arbitrary shape.
 