apply the model to real life text datasets: How to efficiently and accurately compute $\kappa_h$, $h = 1, \ldots, k$, from (6.11) for high-dimensional data? The problem of estimating $\kappa_h$ is analyzed in Section 6.5.1 and experimentally studied in Section 6.5.2.
6.5.1 Approximating $\kappa$
Recall that, due to the lack of an analytical solution, it is not possible to estimate $\kappa$ directly. One could employ a nonlinear root-finder for estimating $\kappa$, but for high-dimensional data, overflows and numerical instabilities plague such root-finders. Therefore, an asymptotic approximation is the best choice for estimating $\kappa$. Such approaches also have the benefit of taking constant computation time, as opposed to any iterative method.
Mardia and Jupp (30) provide approximations for estimating $\kappa$ for a single component (6.5) for two limiting cases (Approximations (10.3.7) and (10.3.10) of (30, pp. 198)):
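\[
\kappa \approx \frac{d-1}{2(1-r)} \quad \text{valid for large } r, \tag{6.14}
\]
\[
\kappa \approx d r \left( 1 + \frac{d}{d+2}\, r^{2} + \frac{d^{2}(d+8)}{(d+2)^{2}(d+4)}\, r^{4} \right) \quad \text{valid for small } r, \tag{6.15}
\]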
where $r$ is given by (6.5).
These approximations rest on an assumption about the size of $\kappa$ relative to $d$ that is typically not valid for high-dimensional data (see the discussion in Section 6.8 for an intuition).
Furthermore, the $r$ values corresponding to the text datasets considered in this chapter are in the mid-range rather than in the two extreme ranges of $r$ that are catered to by the above approximations. We obtain a more accurate approximation for $\kappa$ as described below. With $A_d(\kappa) = I_{d/2}(\kappa) / I_{d/2-1}(\kappa)$, observe that $A_d(\kappa)$ is a ratio of Bessel functions that differ in their order by just one.
Fortunately, there exists a continued fraction representation of $A_d(\kappa)$ (52), given by
\[
A_d(\kappa) = \frac{I_{d/2}(\kappa)}{I_{d/2-1}(\kappa)} = \cfrac{1}{\cfrac{d}{\kappa} + \cfrac{1}{\cfrac{d+2}{\kappa} + \cdots}}. \tag{6.16}
\]
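The continued fraction in (6.16) can be evaluated numerically without ever forming the Bessel functions themselves. The following is a minimal Python sketch, not the chapter's own implementation: the function name `bessel_ratio` and the truncation depth of 500 levels are assumptions made for illustration. It evaluates the fraction from the innermost retained level outward and checks the result against the closed form $A_3(\kappa) = \coth\kappa - 1/\kappa$, which holds for $d = 3$.

```python
import math


def bessel_ratio(d: int, kappa: float, depth: int = 500) -> float:
    """Approximate A_d(kappa) = I_{d/2}(kappa) / I_{d/2-1}(kappa).

    Evaluates the continued fraction (6.16) bottom-up, keeping the
    partial denominators d/kappa, (d+2)/kappa, ..., (d+2*depth)/kappa.
    The cut-off `depth` is an illustrative assumption.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    tail = 0.0  # everything below the last retained level is dropped
    for j in range(depth, -1, -1):
        tail = 1.0 / ((d + 2 * j) / kappa + tail)
    return tail


if __name__ == "__main__":
    # Sanity check for d = 3, where the ratio has a closed form.
    kappa = 2.0
    print(bessel_ratio(3, kappa), 1.0 / math.tanh(kappa) - 1.0 / kappa)
    # A high-dimensional case (hypothetical values of d and kappa).
    print(bessel_ratio(1000, 500.0))
```

Because each level only divides and adds numbers of moderate size, this evaluation sidesteps the overflow that direct evaluation of $I_{d/2}(\kappa)$ can run into for large $d$. The number of levels needed grows as $\kappa$ becomes large relative to $d$, so the fixed depth should be treated as a tunable parameter.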
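Letting $A_d(\kappa) = r$ and approximating the tail of the continued fraction, which is itself $A_{d+2}(\kappa)$, by $r$, we can write (6.16) approximately as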
\[
\frac{1}{r} \approx \frac{d}{\kappa} + r,
\]
which yields
\[
\kappa \approx \frac{d r}{1 - r^{2}}.
\]
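To see how the three closed-form estimates behave for a mid-range $r$, one can generate $r$ from a known $\kappa$ and invert it with each formula. The sketch below assumes hypothetical values $d = 1000$ and $\kappa = 500$, and uses a truncated evaluation of (6.16) as the reference for $A_d(\kappa)$. All function names are illustrative, and a single test point says nothing about accuracy in general.

```python
def bessel_ratio(d, kappa, depth=500):
    """Reference value of A_d(kappa) from the truncated continued fraction (6.16)."""
    tail = 0.0
    for j in range(depth, -1, -1):
        tail = 1.0 / ((d + 2 * j) / kappa + tail)
    return tail


def kappa_large_r(d, r):
    """Approximation (6.14), intended for large r."""
    return (d - 1) / (2.0 * (1.0 - r))


def kappa_small_r(d, r):
    """Approximation (6.15), intended for small r."""
    c2 = d / (d + 2)
    c4 = d**2 * (d + 8) / ((d + 2) ** 2 * (d + 4))
    return d * r * (1.0 + c2 * r**2 + c4 * r**4)


def kappa_continued_fraction(d, r):
    """Approximation obtained from 1/r = d/kappa + r."""
    return d * r / (1.0 - r**2)


if __name__ == "__main__":
    d, kappa_true = 1000, 500.0  # hypothetical test point
    r = bessel_ratio(d, kappa_true)
    print(f"r = {r:.6f}")
    estimators = [
        ("(6.14) large r", kappa_large_r),
        ("(6.15) small r", kappa_small_r),
        ("continued fraction", kappa_continued_fraction),
    ]
    for name, estimate in estimators:
        value = estimate(d, r)
        print(f"{name:20s} kappa = {value:10.3f}  error = {value - kappa_true:+.3f}")
```

Each estimator is a constant-time expression in $d$ and $r$, which is the practical advantage over an iterative root-finder noted earlier in this section.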