FIGURE 4.6: The graphical model for the correlated topic model in Section 4.4.1.
4.4.2 The Dynamic Topic Model
LDA and the CTM assume that words are exchangeable within each docu-
ment, i.e., their order does not affect their probability under the model. This
assumption is a simplification that is consistent with the goal of identifying
the semantic themes within each document.
But LDA and the CTM further assume that documents are exchangeable
within the corpus, and, for many corpora, this assumption is inappropri-
ate. Scholarly journals, email, news articles, and search query logs all reflect
evolving content. For example, the Science articles “The Brain of Professor
Laborde” and “Reshaping the Cortical Motor Map by Unmasking Latent In-
tracortical Connections” may both concern aspects of neuroscience, but the
field of neuroscience looked much different in 1903 than it did in 1991. The
topics of a document collection evolve over time. In this section, we describe
how to explicitly model and uncover the dynamics of the underlying topics.
The dynamic topic model (DTM) captures the evolution of topics in a se-
quentially organized corpus of documents. In the DTM, we divide the data
by time slice, e.g., by year. We model the documents of each slice with a K-component topic model, where the topics associated with slice t evolve from the topics associated with slice t − 1.
Again, we avail ourselves of the logistic normal distribution, this time using
it to capture uncertainty about the time-series topics. We model sequences of
simplicial random variables by chaining Gaussian distributions in a dynamic
model and mapping the emitted values to the simplex. This is an extension
of the logistic normal to time-series simplex data (39).
For a K-component model with V terms, let π_{t,k} denote a multivariate
Gaussian random variable for topic k in slice t. For each topic, we chain
{π_{1,k}, ..., π_{T,k}} in a state space model that evolves with Gaussian noise:

    π_{t,k} | π_{t−1,k} ∼ N(π_{t−1,k}, σ² I).    (4.16)
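As a minimal numerical sketch of this chain (with an illustrative vocabulary size, number of slices, and a hypothetical noise level σ, and using the softmax map that realizes the logistic normal of Eq. (4.15)):

```python
import numpy as np

rng = np.random.default_rng(0)

V = 5        # vocabulary size (illustrative)
T = 4        # number of time slices (illustrative)
sigma = 0.1  # evolution noise; a hypothetical value

# Chain pi_{1,k}, ..., pi_{T,k} for a single topic k, as in Eq. (4.16):
# pi_{t,k} | pi_{t-1,k} ~ N(pi_{t-1,k}, sigma^2 I).
pi = np.zeros((T, V))
pi[0] = rng.normal(0.0, 1.0, size=V)    # initial natural parameters
for t in range(1, T):
    pi[t] = rng.normal(pi[t - 1], sigma)

def to_simplex(x):
    """Map natural parameters to the simplex (the softmax map of Eq. (4.15))."""
    e = np.exp(x - x.max())             # shift by max for numerical stability
    return e / e.sum()

# Each slice's topic is then a distribution over the V terms;
# nearby slices yield similar distributions because sigma is small.
topics = np.array([to_simplex(pi[t]) for t in range(T)])
```

Because the Gaussian noise is small, the term distributions for adjacent slices stay close, which is exactly the smooth topic drift the DTM is designed to capture.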
When drawing words from these topics, we map the natural parameters back
to the simplex with the function f from Eq. (4.15). Note that the time-series
 