apply the model to real-life text datasets: How to efficiently and accurately
compute $\kappa_h$, $h = 1, \ldots, k$, from (6.11) for high-dimensional data?
The problem of estimating $\kappa_h$ is analyzed in Section 6.5.1 and
experimentally studied in Section 6.5.2.
6.5.1 Approximating κ
Recall that, because no analytical solution exists, the κ values cannot be
estimated directly (see (6.5) and (6.11)). One may employ a nonlinear
root-finder for estimating κ, but for high-dimensional data such root-finders
are plagued by overflows and numerical instabilities. An asymptotic
approximation is therefore the best choice for estimating κ. Such approaches
also have the benefit of taking constant computation time, unlike any
iterative method.
Mardia and Jupp (30) provide approximations for estimating κ for a single
component (6.5) in two limiting cases (Approximations (10.3.7) and (10.3.10)
of (30, p. 198)):

$$\kappa \approx \frac{d-1}{2(1-r)} \quad \text{valid for large } r, \tag{6.14}$$

$$\kappa \approx d r \left( 1 + \frac{d}{d+2}\, r^2 + \frac{d^2 (d+8)}{(d+2)^2 (d+4)}\, r^4 \right) \quad \text{valid for small } r, \tag{6.15}$$

where r is given by (6.5).
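As a rough illustration of how (6.14) and (6.15) are evaluated, the following Python sketch transcribes the two formulas directly. The function names `kappa_large_r` and `kappa_small_r` are hypothetical and do not come from the text; d is the data dimensionality and r is the quantity from (6.5), assumed here to lie in [0, 1).

```python
def kappa_large_r(d, r):
    """Approximation (6.14): (d - 1) / (2 (1 - r)), meant for r close to 1."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r is assumed to lie in [0, 1)")
    return (d - 1) / (2.0 * (1.0 - r))


def kappa_small_r(d, r):
    """Approximation (6.15): a truncated power series in r, meant for r close to 0."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r is assumed to lie in [0, 1)")
    series = (
        1.0
        + d / (d + 2.0) * r**2
        + d**2 * (d + 8.0) / ((d + 2.0) ** 2 * (d + 4.0)) * r**4
    )
    return d * r * series


if __name__ == "__main__":
    # Illustrative values only; d = 1000 is an arbitrary high-dimensional example.
    for r in (0.05, 0.5, 0.95):
        print(r, kappa_large_r(1000, r), kappa_small_r(1000, r))
```

Both functions run in constant time, which is the benefit noted above, but each is only intended for its own extreme of r.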
These approximations assume that κ ≪ d, which is typically not valid for
high-dimensional data (see the discussion in Section 6.8 for an intuition).
Furthermore, the r values corresponding to the text datasets considered in
this chapter are in the mid-range rather than in the two extreme ranges of r
that are catered to by the above approximations. We obtain a more accurate
approximation for κ as described below. With
$A_d(\kappa) = I_{d/2}(\kappa) / I_{d/2-1}(\kappa)$, observe
that $A_d(\kappa)$ is a ratio of Bessel functions that differ in their order by
just one. Fortunately, there exists a continued fraction representation of
$A_d(\kappa)$ (52) given by
$$A_d(\kappa) = \frac{I_{d/2}(\kappa)}{I_{d/2-1}(\kappa)} = \cfrac{1}{\cfrac{d}{\kappa} + \cfrac{1}{\cfrac{d+2}{\kappa} + \cdots}}. \tag{6.16}$$
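To make (6.16) concrete, here is a minimal Python sketch that truncates the continued fraction at a fixed depth and evaluates it from the innermost level outward. The name `bessel_ratio` and the default `depth` are assumptions made for illustration, not part of the text; a larger κ generally calls for a larger depth. No Bessel function is evaluated directly, so the overflow problems mentioned earlier do not arise in this particular computation.

```python
import math


def bessel_ratio(d, kappa, depth=200):
    """Approximate A_d(kappa) via the continued fraction (6.16).

    The fraction is cut off after `depth` levels (an assumed, tunable value)
    and evaluated from the innermost level outward.
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    tail = 0.0
    for j in range(depth - 1, -1, -1):
        # Level j contributes the partial denominator (d + 2j) / kappa.
        tail = 1.0 / ((d + 2 * j) / kappa + tail)
    return tail


if __name__ == "__main__":
    # Sanity check: for d = 3 the ratio I_{3/2} / I_{1/2} has the closed form
    # coth(kappa) - 1/kappa, so the two printed numbers should agree closely.
    kappa = 2.0
    print(bessel_ratio(3, kappa), 1.0 / math.tanh(kappa) - 1.0 / kappa)
```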
Letting $A_d(\kappa) = r$, we can write (6.16) approximately as

$$\frac{1}{r} \approx \frac{d}{\kappa} + r,$$

which yields

$$\kappa \approx \frac{d r}{1 - r^2}.$$
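One way to read this step (an interpretation, not spelled out in the text): the part of the continued fraction below d/κ has the same form with d replaced by d + 2, so replacing it by r gives 1/r ≈ d/κ + r. Rearranging, d/κ ≈ (1 − r²)/r, which gives the closed form above. The Python sketch below computes that estimate and, as a rough check, feeds it back through a truncated continued fraction to see how far the result is from r. The names `kappa_from_r`, `bessel_ratio`, and `residual` are hypothetical, and the truncation depth is an assumed value; the small continued-fraction helper is included only so the block runs on its own.

```python
def kappa_from_r(d, r):
    """Closed-form estimate kappa ~ d r / (1 - r^2), assuming 0 <= r < 1."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r is assumed to lie in [0, 1)")
    return d * r / (1.0 - r * r)


def bessel_ratio(d, kappa, depth=500):
    """Truncated continued fraction for A_d(kappa); depth is an assumed cutoff."""
    if kappa == 0.0:
        return 0.0  # limiting value of the ratio as kappa tends to 0
    tail = 0.0
    for j in range(depth - 1, -1, -1):
        tail = 1.0 / ((d + 2 * j) / kappa + tail)
    return tail


def residual(d, r):
    """A_d(kappa_hat) - r; zero would mean kappa_hat solves A_d(kappa) = r."""
    return bessel_ratio(d, kappa_from_r(d, r)) - r


if __name__ == "__main__":
    # Illustrative inputs only; d = 10 is chosen arbitrarily.
    for r in (0.1, 0.5, 0.9):
        print(r, kappa_from_r(10, r), residual(10, r))
```

A nonzero residual indicates how much the closed form deviates from an exact solution of $A_d(\kappa) = r$ for the given d and r.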