Advanced Analytical Theory and Methods: Text Analysis - Data Science and Big Data Analytics

Figure 9.5 Distributions of ten topics over nine scientific documents from the Cora dataset

The code that follows shows how to generate a graph similar to Figure 9.5 using R and add-on packages such as lda and ggplot2.

require("ggplot2")

require("reshape2")

require("lda")

# load documents and vocabulary

data(cora.documents)

data(cora.vocab)

theme_set(theme_bw())

# Number of topic clusters to display

K <- 10

# Number of documents to display

N <- 9

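# Fit the topic model with collapsed Gibbs sampling
result <- lda.collapsed.gibbs.sampler(cora.documents,
                                      K,           ## Num clusters
                                      cora.vocab,
                                      25,          ## Num iterations
                                      alpha = 0.1, ## Document-topic prior
                                      eta = 0.1,   ## Topic-word prior (required argument; 0.1 is assumed here, matching alpha)
                                      compute.log.likelihood = TRUE)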
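The sampler's result includes document_sums, a K x D matrix that counts how many tokens in each document were assigned to each topic. A plot like Figure 9.5 needs proportions rather than raw counts. The following is a minimal sketch of that normalization step, using a small made-up count matrix as a stand-in for result$document_sums; the values and the variable name topic.proportions are illustrative assumptions, not output from the Cora data.

```r
# Hypothetical stand-in for result$document_sums:
# rows are topics (K = 3), columns are documents (D = 2)
document_sums <- matrix(c(2, 1, 1,
                          0, 3, 2),
                        nrow = 3, ncol = 2)

# Transpose to documents x topics, then divide each document's
# counts by its total so that every row sums to 1
topic.proportions <- t(document_sums) / colSums(document_sums)

# An empty document would give 0/0 = NaN; fall back to a uniform distribution
topic.proportions[is.na(topic.proportions)] <- 1 / nrow(document_sums)

print(topic.proportions)
```

Each row of topic.proportions is then one document's distribution over topics, which is the quantity a per-document bar chart would display.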
