typicality of an instance $o \in O$ on attributes $A_{i_1}, \ldots, A_{i_l}$ ($1 \le i_j \le n$ for $1 \le j \le l$) is $DT(o, U, V) = T(o, U) - T(o, V)$, where $T(o, U)$ and $T(o, V)$ are the simple typicality values of instance $o$ with respect to $U$ and $V$, respectively.
In the definition, the discriminative typicality of an instance is defined as the
difference of its simple typicality in the target object and that in the rest of the
data set. One may wonder whether using the ratio $T(o, U) / T(o, V)$ may also be meaningful.
Unfortunately, such a ratio-based definition may not choose a typical instance that has a large simple typicality value with respect to $U$. Consider an extreme example. Let $o$ be an instance that is very atypical with respect to $U$ and has a typicality value of nearly 0 with respect to $V$. Then, $o$ still has an infinite ratio $T(o, U) / T(o, V)$. Although $o$ is discriminative between $U$ and $V$, it is not typical with respect to $U$ at all.
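The contrast between the two scoring rules can be checked with a few numbers. The sketch below is illustrative only: the typicality values are made up for the purpose of the comparison, and the helper names `difference_score` and `ratio_score` are hypothetical, not taken from the text.

```python
def difference_score(t_u, t_v):
    """Discriminative typicality as defined above: T(o, U) - T(o, V)."""
    return t_u - t_v


def ratio_score(t_u, t_v):
    """Ratio-based alternative T(o, U) / T(o, V); infinite when T(o, V) is 0."""
    return float("inf") if t_v == 0 else t_u / t_v


# Hypothetical (T(o, U), T(o, V)) pairs, chosen only to illustrate the argument.
typical_in_u = (0.30, 0.10)      # typical in U, somewhat typical in V
atypical_in_u = (1e-6, 1e-12)    # very atypical in U, nearly 0 in V

for name, (t_u, t_v) in [("typical", typical_in_u), ("atypical", atypical_in_u)]:
    print(name, difference_score(t_u, t_v), ratio_score(t_u, t_v))
```

Under the ratio, the atypical instance is ranked far above the typical one; under the difference, the ordering is reversed, which matches the argument in the paragraph above.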
Due to the unknown distribution of random vectors $U$ and $V$, we use $DT(o, O, S) = T(o, O) - T(o, S)$ to estimate $DT(o, U, V)$, where $T(o, O)$ and $T(o, S)$ are the estimators of $T(o, U)$ and $T(o, V)$, respectively.
Given a set of uncertain instances on attributes $A_{i_1}, \ldots, A_{i_l}$ of interest, a predicate $P$ and a positive integer $k$, a top-$k$ discriminative typicality query treats the set of instances satisfying $P$ as the target object, and returns the $k$ instances in the target object having the largest discriminative typicality values computed on attributes $A_{i_1}, \ldots, A_{i_l}$.
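A minimal sketch of such a query on a single attribute is given below. It makes an assumption that this passage does not state: the simple typicality estimator is taken to be a Gaussian kernel density estimate with a fixed bandwidth. The function names, the `bandwidth` parameter, and the sample data are all hypothetical.

```python
import math


def simple_typicality(x, sample, bandwidth=1.0):
    """Assumed estimator: Gaussian kernel density of x over `sample`."""
    if not sample:
        return 0.0
    norm = len(sample) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample) / norm


def top_k_discriminative(instances, predicate, key, k, bandwidth=1.0):
    """Return the k (score, instance) pairs with the largest T(o, O) - T(o, S).

    Instances satisfying `predicate` form the target object O;
    all remaining instances form S.
    """
    target = [o for o in instances if predicate(o)]
    rest_vals = [key(o) for o in instances if not predicate(o)]
    target_vals = [key(o) for o in target]
    scored = [
        (simple_typicality(key(o), target_vals, bandwidth)
         - simple_typicality(key(o), rest_vals, bandwidth), o)
        for o in target
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical data: white points near 0..2 plus one at 5, black points near 4..6.
points = [{"x": x, "color": "white"} for x in (0.0, 1.0, 2.0, 5.0)]
points += [{"x": x, "color": "black"} for x in (4.0, 5.0, 6.0)]

for score, o in top_k_discriminative(points, lambda o: o["color"] == "white",
                                     lambda o: o["x"], k=2):
    print(o["x"], round(score, 4))
```

The white point at 5.0 sits among the black points, so its estimated score is negative and it is not returned, even though it belongs to the target object.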
Example 2.6 (Top-k discriminative typicality queries). Consider the set of points in
Figure 2.1(a) again and a top- 3 discriminative typicality query on attribute X with
predicate COLOR = white. The discriminative typicality $DT(o, white, black)$ for each instance $o \in white$ is plotted in the figure, where white and black denote the two uncertain objects, the one with white points as instances and the one with black points as instances, respectively. To see the difference between discriminative typicality and simple typicality, consider instances a, b and c, which have large simple typicality values among all white points. However, compared with other white points, they also have relatively high simple typicality values as members of the subset of black points. Therefore, they are not discriminative. Points {d, e, f} are the answer to the query, since they are discriminative.
2.2.1.3 Representative Typicality
The answer to a top- k simple typicality query may contain some similar instances,
since the instances with similar attribute values may have similar simple typicality
scores. However, in some situations, it is redundant to report many similar instances.
Instead, a user may want to explore the uncertain object by viewing typical instances
that are different from each other but jointly represent the uncertain object well.
Suppose a subset of instances $A \subset O$ is chosen to represent $O$. Each instance in $(O - A)$ is best represented by the closest instance in $A$. For each $o \in A$, we define the representing region of $o$.
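The "closest instance in A" rule can be sketched as follows. This only illustrates the assignment step described above. It is not the definition of the representing region, which this passage does not give in full. The function name `assign_to_closest` and the choice of absolute difference as the distance are assumptions made for the example.

```python
def assign_to_closest(instances, representatives, distance):
    """Map each representative to the non-representative instances closest to it.

    Ties are broken in favour of the representative listed first.
    Instances are assumed to be hashable (numbers or tuples).
    """
    groups = {r: [] for r in representatives}
    for o in instances:
        if o in groups:
            continue  # representatives represent themselves
        closest = min(representatives, key=lambda r: distance(o, r))
        groups[closest].append(o)
    return groups


# Hypothetical one-dimensional instances and two chosen representatives.
O = [1, 2, 3, 8, 9, 10]
A = [2, 9]
print(assign_to_closest(O, A, lambda a, b: abs(a - b)))
```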
 