Related Work - Ranking Queries on Uncertain Data

are user input parameters. However, “core points” are significantly different from

“typical points” in the following two aspects.

First, a “core point” may not be typical. Consider an extreme case where there

are two groups of points: the points in the first group lie close to each other, and

the group is much larger than MinPts, while the second group contains only MinPts points lying

within a radius Eps of a point o that is far away from the points in the first

group. Then, o is a core point but it is not typical at all. Second, a typical point may

not be a “core point”, either. It is possible that a typical point does not have MinPts

points lying within a distance Eps from it, but it still has a high typicality score. A

comparison between clustering and typicality analysis on real data sets is given in

Chapter 4.
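
The two aspects above can be illustrated with a small sketch. It assumes the usual DBSCAN-style test for a core point (at least MinPts points, the point itself included, within distance Eps) and, purely as a stand-in for a typicality score, an average Gaussian kernel weight. The function names, the bandwidth h, and the sample coordinates are hypothetical and are not taken from the text.

```python
import math

def dist(p, q):
    """Euclidean distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def is_core_point(o, points, eps, min_pts):
    """DBSCAN-style test: at least min_pts points (o included) lie within eps of o."""
    return sum(1 for p in points if dist(o, p) <= eps) >= min_pts

def kernel_score(o, points, h):
    """Assumed typicality stand-in: mean Gaussian kernel weight of all points seen from o."""
    return sum(math.exp(-dist(o, p) ** 2 / (2 * h * h)) for p in points) / len(points)

# Hypothetical data: a large, tight group near the origin (100 points) ...
big_group = [(0.1 * i, 0.1 * j) for i in range(10) for j in range(10)]
# ... and a remote group of exactly MinPts = 4 points around o.
o = (100.0, 100.0)
small_group = [o, (100.2, 100.0), (100.0, 100.2), (99.8, 100.0)]
points = big_group + small_group

eps, min_pts, h = 0.5, 4, 1.0
center = (0.5, 0.5)  # a point in the middle of the large group

# First aspect: o passes the core-point test, yet scores far lower than center.
print(is_core_point(o, points, eps, min_pts))
print(kernel_score(o, points, h) < kernel_score(center, points, h))

# Second aspect: with a much smaller Eps, center fails the core-point test
# although its score is unchanged, since the score does not depend on Eps.
print(is_core_point(center, points, 0.05, min_pts))
```

The core-point test is a hard threshold on a local neighbourhood, whereas the score aggregates over the whole data set, which is why the two notions can disagree in both directions.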

It is possible to extend the existing clustering methods to answer typicality

queries, by defining the most typical object in a cluster as the centroid and using

the maximal group typicality of clusters as the clustering criterion, which is in the

same spirit as our typicality query evaluation algorithms.

3.3.4 Other Related Models

Typicality probability [93, 94] in statistical discriminant analysis is defined as the

Mahalanobis distance between an object and the centroid of a specified group, which

provides an absolute measure of the degree of membership to the specified group.
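
As a rough illustration of the distance involved, the following sketch computes the Mahalanobis distance between an object and the centroid of a group. It is restricted to two dimensions so that the covariance matrix can be inverted by hand, and it assumes the sample covariance (denominator n - 1). The function names and the sample group are hypothetical.

```python
import math

def mean_and_cov_2d(group):
    """Return the centroid and the sample covariance (sxx, sxy, syy) of 2-D points."""
    n = len(group)
    mx = sum(p[0] for p in group) / n
    my = sum(p[1] for p in group) / n
    sxx = sum((p[0] - mx) ** 2 for p in group) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in group) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in group) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_2d(x, centroid, cov):
    """sqrt((x - c)^T S^{-1} (x - c)) for a 2x2 covariance matrix S."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix is singular or not positive definite")
    dx, dy = x[0] - centroid[0], x[1] - centroid[1]
    # The inverse of [[sxx, sxy], [sxy, syy]] is (1/det) * [[syy, -sxy], [-sxy, sxx]].
    q = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(q)

# Hypothetical group that spreads more along x than along y.
group = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0), (4.0, 2.0), (2.0, 1.0)]
centroid, cov = mean_and_cov_2d(group)

# Both objects are at Euclidean distance 2 from the centroid,
# but the one displaced along the low-variance axis is farther in Mahalanobis terms.
print(mahalanobis_2d((4.0, 1.0), centroid, cov))
print(mahalanobis_2d((2.0, 3.0), centroid, cov))
```

Scaling each direction by the group's own spread is what makes the result a measure relative to that group rather than a plain geometric distance.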

Spatially-decaying aggregation [95, 96] is defined as the aggregation values influenced

by the distance between data items. Generally, the contribution of a data

item to the aggregation value at a certain location decays as its distance to that location

increases. Nearly linear time algorithms are proposed to compute the ε-approximate

aggregation values when the metric space is defined on a graph or on the Euclidean

plane.
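
The decay idea described above can be sketched as a brute-force sum. This is not the nearly linear time approximation algorithm of [95, 96]; it is an exact quadratic-cost baseline, shown only to make the definition concrete. The exponential decay function, the rate lam, the function names, and the sample items are assumptions made for illustration.

```python
import math

def euclidean(p, q):
    """Euclidean distance in the plane; any other metric could be passed instead."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def decayed_sum(location, items, dist, decay):
    """Sum of item values, each weighted by decay(distance from the item to location)."""
    return sum(value * decay(dist(location, pos)) for pos, value in items)

# Assumed exponential decay: the weight halves every 5 distance units.
lam = math.log(2) / 5.0

def exp_decay(d):
    return math.exp(-lam * d)

# Hypothetical items: (position, value).
items = [((0.0, 0.0), 1.0), ((3.0, 4.0), 1.0)]

# At the origin, the first item counts fully and the second (distance 5) counts about half.
print(decayed_sum((0.0, 0.0), items, euclidean, exp_decay))
```

Because the distance function is a parameter, the same baseline applies to any metric, at the cost of one distance evaluation per item and query location.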

3.3.4.1 How Is Our Study Related?

Discriminant analysis mainly focuses on how to correctly classify the objects. It

does not consider the typicality of group members. Our definition of discriminative

typicality combines both the discriminability and the typicality of the group members,

which is more powerful in capturing the “important” instances in multi-class

data sets. Moreover, [93, 94] do not discuss how to answer those queries efficiently

on large data sets.

Spatially-decaying sum with exponential decay function [95, 96] is similar to our

definition of simple typicality. However, in [95, 96], the spatially-decaying aggregation

problem is defined on graphs or Euclidean planes, while we assume only

a generic metric space. The efficiency in [95, 96] may not carry over to

the more general metric space. The techniques developed in Chapter 4

may be useful to compute spatially-decaying aggregation on a general metric space.
