Database Reference
In-Depth Information
abstract prototypes that might not exist in real life. Finally, the schema view [74]
improves the prototype view by modeling concepts in schema theory and artificial
intelligence knowledge representation.
Feature-frequency model defines typicality from a different scope [75]. A typical
member of a category will share many attributes with other members of the category
and few attributes with members of other categories. An attribute can be scored
based on how many members possess that attribute. A family resemblance score
for each member sums up the numerical scores of all attributes possessed by that
member. A category member with a higher family resemblance score is considered
more typical.
Although typicality has not been used before in query answering on large
databases, the idea of typicality was recently introduced into ontology design and
conceptual modeling [76], which are generally related to database design.
3.3.1.1 How Is Our Study Related?
Our typicality measures are in the general spirit of typicality measures used in psy-
chology and cognitive science. As suggested by the previous studies in psychology
and cognitive science, typicality measures may vary in different applications. In
our study, we propose simple typicality, discriminative typicality, and representative
typicality for different application requirements.
Studies on typicality in psychology and cognitive science often do not address
the concerns about efficient query answering from large databases. Complementary
to those studies, we focus on efficient query answering.
3.3.2 The (Discrete) k-Median Problem
Finding typical objects is broadly related to the k -median problem in computational
geometry. Given a set S of n points, the k-median problem is to find a set M of k
points minimizing the sum of distances from all points in S to M . Points in M are
called the medians of S . Under the constraint that points in M belong to S ,itis
known as the discrete k-median problem . When k
=
1, we can find the exact median
n 2
in O
time. When k is an arbitrary input parameter, the discrete k median problem
on any distance metric is NP -hard [28].
Several approximation algorithms have been proposed to compute the approxi-
mate 1-median efficiently. [77] proposes a quad-tree based data structure to support
finding the approximate median with a constant approximation ratio in O
(
)
)
time. A randomized algorithm is proposed in [78], which computes the approxi-
mate median in linear time. Although the approximation ratio cannot be bounded,
it performs well in practice. [79] provides a
(
n log n
(
1
+ δ )
-approximation algorithm with
5
runtime O
based on sufficiently large sampling. [80] proposes an algorithm
to solve the median problem in L 1 metric in O
(
n
/ δ
)
(
n log n
)
time.
Search WWH ::




Custom Search