Database Reference
In-Depth Information
in Figure 4.2 from the root node N 1 , and recursively examine each internal node.
During the search, node N 4 can be pruned since all points in N 4 lie out of the
σ
-
neighborhood of N 6 .
The
cost
of
computing
the
local
simple
typicality
of
an
instance x is
O
( |
LN
( {
x
},
O
, σ ) | )
. Then, computing the local simple typicality of all instances in
( x O |
O takes O
time. Although the search can be sped up using a
VP-tree, the complexity of the DLTA algorithm is still O
LN
( {
x
},
O
, σ ) | )
2
( |
|
)
O
. The reason is, in the
worst case where
σ
is larger than the diameter (i.e., the largest pairwise distance) of
the data set, the
σ
-neighborhood of each instance contains all other instances in O .
4.2.3 LT3: Local Typicality Approximation Using Tournaments
To reduce the cost of computing local simple typicality further, we incorporate the
tournament mechanism, and propose a local typicality approximation algorithm us-
ing tournaments (the LT3 algorithm). The basic idea is to group the instances locally,
and conduct a tournament in each local group of instances. The instances with the
largest local simple typicality is selected as the winner. The winners are sent to the
next round of tournament. The tournaments terminate when only one instance is left
as the winner. A sampling method is employed to reduce the computational cost.
4.2.3.1 Local Typicality Trees (LT-trees)
A local typicality tree (LT-tree) is an MVP-tree [180] with auxiliary information that
supports local typicality calculation and tournaments. Given an uncertain object O ,
an LT-tree can be constructed as follows.
First, we construct an MVP-tree [180] on O , which is a t -nary VP-tree that uses
more than one vantage point to partition the space. Without loss of generality, let
us assume t
2 l and the data set contains t m instances. We assign a layer number
to each node in the MVP-tree. The root node has layer number 0, and a node is
assigned layer number
=
if its parent has layer number i . We remove all those
nodes in the MVP-tree whose layer number is not a multiple of t . For a node N
of layer number jt
(
i
+
1
)
(
j
1
)
, we connect N to its ancestor in the MVP-tree of layer
(
t .
Second, we compute three pieces of information, the approximate center, the
radius, and the
j
1
)
σ
-neighborhood, for each node in the LT-tree.
Approximate Center
For a node N in the LT-tree, let O N be the set of instances at N . To compute the
approximate center at a node N in the LT-tree, we draw a sample R of |
in-
stances from O N , and compute the pairwise distance between every two instances in
O N |
Search WWH ::




Custom Search