Databases Reference
In-Depth Information
q
m
r
p
s
o
Figure10.14 Density-reachability and density-connectivity in density-based clustering. Source: Based on
Ester, Kriegel, Sander, and Xu [EKSX96].
“How does DBSCAN find clusters?” Initially, all objects in a given data set D are
marked as “ unvisited .” DBSCAN randomly selects an unvisited object p , marks p as
visited ,” and checks whether the
-neighborhood of p contains at least MinPts objects.
If not, p is marked as a noise point. Otherwise, a new cluster C is created for p , and all
the objects in the
-neighborhood of p are added to a candidate set, N . DBSCAN iter-
atively adds to C those objects in N that do not belong to any cluster. In this process,
for an object p 0 in N that carries the label “ unvisited ,” DBSCAN marks it as “ visited ” and
checks its
-neighborhood of p 0 has at least MinPts objects, those
-neighborhood. If the
-neighborhood of p 0 are added to N . DBSCAN continues adding objects
to C until C can no longer be expanded, that is, N is empty. At this time, cluster C is
completed, and thus is output.
To find the next cluster, DBSCAN randomly selects an unvisited object from the
remaining ones. The clustering process continues until all objects are visited. The
pseudocode of the DBSCAN algorithm is given in Figure 10.15.
If a spatial index is used, the computational complexity of DBSCAN is O
objects in the
.
n log n
/
,
n 2
where n is the number of database objects. Otherwise, the complexity is O
.
/
. With
appropriate settings of the user-defined parameters,
and MinPts , the algorithm is
effective in finding arbitrary-shaped clusters.
10.4.2 OPTICS:OrderingPointstoIdentify
theClusteringStructure
Although DBSCAN can cluster objects given input parameters such as
(the maxi-
mum radius of a neighborhood) and MinPts (the minimum number of points required
in the neighborhood of a core object), it encumbers users with the responsibility of
selecting parameter values that will lead to the discovery of acceptable clusters. This is
a problem associated with many other clustering algorithms. Such parameter settings
 
Search WWH ::




Custom Search