Database Reference
In-Depth Information
answers to top- k typicality queries with respect to the choice of kernel functions
and bandwidth values. The results show that the answers computed using different
kernel functions listed in Table 4.1 are mostly consistent. Moreover, using different
bandwidth values around h
0 6 s
n also provide consistent answers.
Outliers in instances may increase the standard deviation of the instances in O ,
and thus lead to larger bandwidth values, which may impair the quality of the an-
swers to typicality queries. Therefore, for better performance, we can remove out-
liers in the set of instances as preprocessing. There are extensive studies on effective
and efficient outlier detection [176, 177, 178], which can be used as a screening step
in our methods. Moreover, it is shown from the experimental results in Section 4.5.3
that, even on an uncertain object containing a non-trivial amount of noise, the results
returned by top- k typicality queries are often consistent with the results found when
outliers are removed.
1
.
=
Name
Kernel function
1
2 I ( | u |≤ 1 )
K ( u )=
Uniform
Triangle
K
(
u
)=(
1
−|
u
| )
I
( |
u
|≤
1
)
3
4
u 2
Epanechnikov K
(
u
)=
(
1
)
I
( |
u
|≤
1
)
15
16
u 2
2 I
Quartic
K
(
u
)=
(
1
)
( |
u
|≤
1
)
35
32
u 2
3 I
Triweight K
(
u
)=
(
1
)
( |
u
|≤
1
)
1
2 u 2
1
2
e
Gaussian
K
(
u
)=
π
)= 4 cos
( 2 u
Cosine
K
(
u
)
I
( |
u
|≤
1
)
Table 4.1 The commonly used kernel functions.
Since we address the top- k typicality problem in a generic metric space, the only
parameter we use in density estimation is the distance (or similarity) between two
instances. Formally, given an uncertain object O
=(
o 1 ,
o 2 ,...,
o n )
in a generic met-
ric space, the underlying likelihood function is approximated as
n
i = 1 G h ( x , o i )=
i = 1 e d ( x , o i ) 2
n
1
nh
1
nh 2
L
(
x
|
O
)=
2 h 2
(4.1)
π
(
,
)
is the distance between x and o i in the metric space, and G h (
x
,
o i
)=
where d
x
o i
2
2 h 2 is a Gaussian kernel .
Hereafter, by default, we assume that outliers are removed using the techniques
discussed in Section 4.1.1.
d ( x , o i )
e
1
2
π
4.1.2 An Exact Algorithm and Complexity
Theoretically, given an uncertain object O , if the likelihood of an instance o
O
1
o O d
satisfies L
(
o
|
O
) ∝
, then the discrete 1-median problem can be reduced
o )
(
o
,
 
Search WWH ::




Custom Search