Databases Reference
In-Depth Information
of all objects under analysis and represents the density-based clustering structure of the
data. Objects in a denser cluster are listed closer to each other in the cluster ordering.
This ordering is equivalent to density-based clustering obtained from a wide range of
parameter settings. Thus, OPTICS does not require the user to provide a specific density
threshold. The cluster ordering can be used to extract basic clustering information (e.g.,
cluster centers, or arbitrary-shaped clusters), derive the intrinsic clustering structure, as
well as provide a visualization of the clustering.
To construct the different clusterings simultaneously, the objects are processed in a
specific order. This order selects an object that is density-reachable with respect to the
lowest
) will be finished first. Based
on this idea, OPTICS needs two important pieces of information per object:
value so that clusters with higher density (lower
0 such
The
core-distance
of
an
object p is
the
smallest
value
that
the
0 is the minimum dis-
tance threshold that makes p a core object. If p is not a core object with respect to
0 -neighborhood of p has at least MinPts objects. That is,
and MinPts , the core-distance of p is undefined.
The reachability-distance to object p from q is the minimum radius value that makes
p density-reachable from q . According to the definition of density-reachability, q
has to be a core object and p must be in the neighborhood of q . Therefore, the
reachability-distance from q to p is max f core-distance ( q ), dist ( p , q
/g. If q is not a
core object with respect to
and MinPts , the reachability-distance to p from q is
undefined.
An object p may be directly reachable from multiple core objects. Therefore, p
may have multiple reachability-distances with respect to different core objects. The
smallest reachability-distance of p is of particular interest because it gives the shortest
path for which p is connected to a dense cluster.
Example10.8 Core-distance and reachability-distance. Figure 10.16 illustrates the concepts of core-
distance and reachability-distance. Suppose that
D 6 mm and MinPts D 5. The core-
0 , between p and the fourth closest data object from p .
The reachability-distance of q 1 from p is the core-distance of p (i.e.,
distance of p is the distance,
0 D 3 mm) because
this is greater than the Euclidean distance from p to q 1 . The reachability-distance of q 2
with respect to p is the Euclidean distance from p to q 2 because this is greater than the
core-distance of p .
OPTICS computes an ordering of all objects in a given database and, for each object
in the database, stores the core-distance and a suitable reachability-distance. OPTICS
maintains a list called OrderSeeds to generate the output ordering. Objects in Order-
Seeds are sorted by the reachability-distance from their respective closest core objects,
that is, by the smallest reachability-distance of each object.
OPTICS begins with an arbitrary object from the input database as the current
object, p . It retrieves the
-neighborhood of p , determines the core-distance, and sets
the reachability-distance to undefined . The current object, p , is then written to output.
 
Search WWH ::




Custom Search