Database Reference
In-Depth Information
R . Then, for each instance x
R , the center-score of x is the maximum distance from
x to another point in R . The instance in R of the minimum center-score is chosen as
the center. This center approximation procedure is popularly used in computational
geometry. It takes O
( |
O N | )
time for each node N , and O
( |
O
|
log t |
O
| )
time for all
nodes in the LT-tree.
Radius
Once the center c of a node N is chosen, the radius is given by the maximum distance
between c and the other instances at N . This can be computed in time O ( | O N
| )
for
each node N , and O
( |
O
|
log t |
O
| )
for all nodes in the LT-tree.
σ
-neighborhood
We use a range query in the LT-tree to compute a superset of the
-neighborhood of
O N for every node N in the LT-tree, which always achieves a typicality approx-
imation no worse than using the
σ
-neighborhood. To compute the superset, we
start from the root and iteratively search for the nodes that completely lie in the
σ
σ
-neighborhood of N , using the approximate center and radius of N . Once all ob-
jects at a node N are in the
-neighborhood of N ,weuse N to represent them and
do not search any subtrees of N .
σ
4.2.3.2 Query Answering
To answer a top- k simple typicality query, we run tournaments on the LT-tree in a
bottom-up fashion. First, a tournament is run for each leaf node in the LT-tree. The
winner enters the tournament at the parent node of the leaf node. The winner o 1 at
the root node is the approximation of the most typical instance. Figure 4.2 illustrates
the procedure of computing the approximate most typical point in the set of points
in Figure 4.1 using an LT-tree. During the tournaments in the leaf nodes,
{
b
,
c
,
e
,
h
}
are selected as local winners and sent to the parent nodes. Then,
are selected
in the tournaments in nodes N 2 and N 3 . Finally, e is selected as the winner, which
approximates the most typical point in the data set.
To find the approximation of the second most typical instance, we do not need
to completely run the tournaments again. Instead, we can reuse most of the results
in the tournaments of finding the most typical instance w 1 . The only tournaments
we need to rerun are on the nodes containing w 1 . First, we run a new tournament at
the leaf node N containing w 1 , but do not include w 1 in the new tournament. Then,
the winner w 1 is sent to N p , the parent of N , and a new tournament is run there by
replacing w 1 by w 1 . A series of m tournaments are needed to find a new winner w 2
in the root node, which is the approximation of the second most typical instance. At
each level of the LT-tree, only one node needs to run a tournament. For example, in
{
c
,
e
}
Search WWH ::




Custom Search