Figure 9.4 The intuitions behind LDA
The reader can refer to the original paper [29] for the mathematical details of LDA.
In essence, LDA is a form of hierarchical Bayesian estimation: it infers a posterior
distribution over hidden topics and uses it to group data, such as documents, that
share similar topics.
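As a rough orientation only, the hierarchical structure can be written as the generative process below. The notation is an assumption made here, not taken from the text: K topics, D documents, Dirichlet hyperparameters \alpha and \beta, per-document topic proportions \theta_d, per-topic word distributions \phi_k, and topic assignment z_{d,n} for the n-th word w_{d,n} of document d. This is the commonly used smoothed form; the original paper [29] remains the authoritative source.

```latex
\begin{align*}
\phi_k &\sim \mathrm{Dirichlet}(\beta), & k &= 1,\dots,K \\
\theta_d &\sim \mathrm{Dirichlet}(\alpha), & d &= 1,\dots,D \\
z_{d,n} \mid \theta_d &\sim \mathrm{Multinomial}(\theta_d) \\
w_{d,n} \mid z_{d,n}, \phi &\sim \mathrm{Multinomial}(\phi_{z_{d,n}})
\end{align*}
% Only the words w are observed; inference targets the posterior
\[
p(\theta, z, \phi \mid w, \alpha, \beta)
  = \frac{p(\theta, z, \phi, w \mid \alpha, \beta)}{p(w \mid \alpha, \beta)}
\]
```

Documents whose inferred \theta_d place most of their mass on the same topics are the ones LDA groups together.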
Many programming tools provide software packages that can perform LDA over
datasets. R offers an lda package [31] with built-in functions and sample
datasets. The lda package was developed by David M. Blei's research group [32].
Figure 9.5 shows the distributions of ten topics on nine scientific documents
randomly drawn from the cora dataset of the lda package. The cora dataset is
a collection of 2,410 scientific documents extracted from the Cora search engine
[33].
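To make the idea of per-document topic distributions concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written in Python with only the standard library. It is not the code of the R lda package and it does not use the cora dataset: the toy corpus, the function name lda_gibbs, and the values chosen for alpha, beta, and the iteration count are all assumptions made for illustration.

```python
import random

def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
              iters=200, seed=0):
    """Hypothetical helper: collapsed Gibbs sampling for LDA.

    docs is a list of documents, each a list of integer word ids.
    Returns (theta, topic_word): smoothed per-document topic proportions
    and the raw topic-by-word count table.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics                        # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignments[d][n]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][n] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed estimate of each document's topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return theta, topic_word

# Toy corpus (invented for illustration, not the cora data).
vocab = ["gene", "dna", "cell", "network", "learning", "neural"]
word_id = {w: i for i, w in enumerate(vocab)}
raw_docs = [
    "gene dna cell gene dna",
    "cell dna gene cell",
    "network learning neural network",
    "neural learning network learning",
]
docs = [[word_id[w] for w in text.split()] for text in raw_docs]

theta, topic_word = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab))
for d, row in enumerate(theta):
    print(d, [round(p, 3) for p in row])
```

Each printed row is one document's estimated distribution over the topics, which is the kind of quantity a figure such as Figure 9.5 displays for the sampled documents. Topic labels are arbitrary, so the topic numbering carries no meaning across runs with different seeds.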