generated by Bayesian topic models such as LDA (4).
TABLE 6.9: Five of the topics obtained by running batch vMF on slash-7.

music       web       scientists    internet     games
apple       google    nasa          broadband    gaming
itunes      search    space         domain       game
riaa        yahoo     researchers   net          nintendo
ipod        site      science       network      sony
wikipedia   online    years         verisign     xbox
digital     sites     earth         bittorrent   gamers
napster     ebay      found         icann        wii
file        amazon    brain         service      console
drm         engine    university    access       video
6.8 Discussion
The mixture of vMF distributions gives a parametric model-based gener-
alization of the widely used cosine similarity measure. As discussed in Sec-
tion 6.6, the spherical kmeans algorithm that uses cosine similarity arises as a
special case of EM on mixture of vMFs when, among other things, the concen-
tration κ of all the distributions is held constant. Interestingly, an alternative
and more formal connection can be made from an information geometry view-
point (2). More precisely, consider a dataset that has been sampled from a
vMF distribution with a given κ, say κ = 1. Assuming the Fisher information
matrix is the identity, the Fisher kernel similarity (25) corresponding to the
vMF distribution is given by
K(x_i, x_j) = (∇_μ ln f(x_i | μ))^T (∇_μ ln f(x_j | μ))    (see (6.1))
            = (∇_μ (μ^T x_i))^T (∇_μ (μ^T x_j)) = x_i^T x_j,
which is exactly the cosine similarity. This provides a theoretical justification
for a long-practiced approach in the information retrieval community.
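This reduction is easy to verify numerically: with ln f(x | μ) = ln c_d(κ) + κ μ^T x and κ = 1, the normalizer does not depend on μ, so the gradient is just x, and the Fisher kernel is the plain dot product, i.e., the cosine similarity of unit vectors. A small sketch (toy vectors, not from the chapter; the gradient is taken with μ treated as unconstrained, and the finite-difference check is only there to confirm it):

```python
import numpy as np

rng = np.random.default_rng(0)

def ln_f(x, mu, kappa=1.0):
    # vMF log-density up to the normalizer ln c_d(kappa),
    # which does not depend on mu and drops out of the gradient.
    return kappa * mu @ x

def unit(v):
    return v / np.linalg.norm(v)

# Two unit data vectors and a unit mean direction.
xi, xj, mu = (unit(rng.normal(size=5)) for _ in range(3))

# Analytic gradient of ln f w.r.t. mu at kappa = 1 is just x itself;
# confirm with central finite differences.
eps = 1e-6
num_grad = np.array([(ln_f(xi, mu + eps * e) - ln_f(xi, mu - eps * e)) / (2 * eps)
                     for e in np.eye(5)])

# Fisher kernel with identity Fisher information matrix ...
K = xi @ xj
# ... equals the cosine similarity of the unit vectors x_i and x_j.
cosine = xi @ xj / (np.linalg.norm(xi) * np.linalg.norm(xj))
```

The choice κ = 1 only rescales the gradient; any fixed κ would rescale the kernel by κ² without changing the induced similarity ordering.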
In terms of performance, the magnitude of improvement shown by the
soft-movMF algorithm for the difficult clustering tasks was surprising, especially
since for low-dimensional non-directional data, the improvements using
a soft, EM-based kmeans or fuzzy kmeans over the standard hard-assignment
based versions are often quite minimal. In particular, a couple of issues ap-
pear intriguing: (i) why is soft-movMF performing substantially better than