are user input parameters. However, “core points” are significantly different from
“typical points” in the following two aspects.
First, a “core point” may not be typical. Consider an extreme case where there
are two groups of points: the first group is much larger than MinPts and its points
lie close to each other, while the second group contains only MinPts points lying
within a radius Eps of a point o that is far away from the points in the first
group. Then o is a core point, but it is not typical at all. Second, a typical point may
not be a “core point” either. It is possible that a typical point does not have MinPts
points lying within a distance Eps of it, but it still has a high typicality score. A
comparison between clustering and typicality analysis on real data sets is given in
Chapter 4.
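The two aspects above can be illustrated with a minimal sketch. It assumes a DBSCAN-style core-point test that counts the point itself, and it assumes that simple typicality takes the form of a Gaussian-kernel density estimate with bandwidth h. The function names, the toy data, and the values of Eps, MinPts, and h are hypothetical and chosen only for illustration.

```python
import math

def is_core_point(o, points, eps, min_pts):
    """DBSCAN-style test: o is a core point if at least min_pts points
    (o itself included) lie within distance eps of o."""
    return sum(1 for p in points if math.dist(o, p) <= eps) >= min_pts

def simple_typicality(o, points, h):
    """Assumed form: Gaussian-kernel density estimate of o over points,
    with a hypothetical bandwidth h."""
    norm = len(points) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-math.dist(o, p) ** 2 / (2 * h * h)) for p in points) / norm

# Toy data (illustrative only): a large, loosely spaced group
# and a tiny, tight group that lies far away from it.
large_group = [(1.5 * i, 1.5 * j) for i in range(10) for j in range(5)]
o = (100.0, 100.0)
small_group = [o, (100.5, 100.0), (100.0, 100.5), (99.5, 100.0)]
points = large_group + small_group

eps, min_pts, h = 1.0, 4, 5.0
c = (6.0, 3.0)  # a point in the middle of the large group

o_is_core = is_core_point(o, points, eps, min_pts)
c_is_core = is_core_point(c, points, eps, min_pts)
t_o = simple_typicality(o, points, h)
t_c = simple_typicality(c, points, h)
print(o_is_core, c_is_core, t_c > t_o)
```

Under these assumptions, o passes the core-point test but receives a much lower score than c, while c fails the core-point test because its nearest neighbors lie farther than Eps away.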
It is possible to extend the existing clustering methods to answer typicality
queries by defining the most typical object in a cluster as the centroid and using
the maximal group typicality of clusters as the clustering criterion, which is in the
same spirit as our typicality query evaluation algorithms.
3.3.4 Other Related Models
Typicality probability [93, 94] in statistical discriminant analysis is defined as the
Mahalanobis distance between an object and the centroid of a specified group, which
provides an absolute measure of the degree of membership to the specified group.
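As a rough illustration of the distance involved, the sketch below computes the Mahalanobis distance between a point and the centroid of a group in two dimensions. The restriction to 2-D, the use of the sample covariance estimated from the group itself, and the function name are assumptions made here for brevity; they are not taken from [93, 94].

```python
import math

def mahalanobis_2d(x, group):
    """Mahalanobis distance from x to the centroid of a 2-D group,
    using the sample covariance of the group (assumed non-singular)."""
    n = len(group)
    mx = sum(p[0] for p in group) / n
    my = sum(p[1] for p in group) / n
    sxx = sum((p[0] - mx) ** 2 for p in group) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in group) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in group) / (n - 1)
    det = sxx * syy - sxy * sxy
    dx, dy = x[0] - mx, x[1] - my
    # Quadratic form d^T S^{-1} d with the closed-form inverse of a 2x2 matrix.
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

# Toy group (illustrative only): spread more widely along x than along y.
group = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0), (4.0, 2.0), (2.0, 1.0)]
d_along_x = mahalanobis_2d((4.0, 1.0), group)  # offset of 2 along x
d_along_y = mahalanobis_2d((2.0, 3.0), group)  # offset of 2 along y
print(d_along_x, d_along_y)
```

The two query points have the same Euclidean distance to the centroid, but the one displaced along the direction of smaller variance is farther away in the Mahalanobis sense.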
Spatially-decaying aggregation [95, 96] is defined as the aggregation values
influenced by the distance between data items. Generally, the contribution of a data
item to the aggregation value at a certain location decays as its distance to that
location increases. Nearly linear time algorithms are proposed to compute the
ε-approximate aggregation values when the metric space is defined on a graph or on
the Euclidean plane.
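The notion of a spatially-decaying sum can be sketched by brute force as follows. The exponential decay function, the parameter name decay_rate, and the toy data are assumptions for illustration. The sketch evaluates every item for each location and is not one of the nearly linear time approximation algorithms of [95, 96].

```python
import math

def decayed_sum(location, items, decay_rate=1.0):
    """Sum of item values, each weighted by exp(-decay_rate * distance)
    between the item position and the query location."""
    return sum(
        value * math.exp(-decay_rate * math.dist(location, pos))
        for pos, value in items
    )

# Toy data (illustrative only): (position, value) pairs on the Euclidean plane.
items = [((0.0, 0.0), 1.0), ((3.0, 4.0), 2.0)]

at_origin = decayed_sum((0.0, 0.0), items)
at_far_item = decayed_sum((3.0, 4.0), items)
print(at_origin, at_far_item)
```

At each query location, the item located there contributes its full value, while the item at distance 5 contributes only a factor exp(-5) of its value.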
3.3.4.1 How Is Our Study Related?
Discriminant analysis mainly focuses on how to correctly classify the objects. It
does not consider the typicality of group members. Our definition of discriminative
typicality combines both the discriminability and the typicality of the group mem-
bers, which is more powerful in capturing the “important” instances in multi-class
data sets. Moreover, [93, 94] do not discuss how to answer those queries efficiently
on large data sets.
Spatially-decaying sum with an exponential decay function [95, 96] is similar to our
definition of simple typicality. However, in [95, 96], the spatially-decaying
aggregation problem is defined on graphs or Euclidean planes, while we assume only
a generic metric space. The efficiency in [95, 96] may not carry over to
the more general metric space. The techniques developed in Chapter 4
may be useful to compute spatially-decaying aggregation on a general metric space.