what-when-how
In Depth Tutorials and Information
with probability k u,v r u,v on any given day, but once the r u,v coin comes up heads, no
further trials are made.
hus, given the transmission graph (and, in particular, each edge's reading fre-
quency r and copy probability k ), the distribution of propagation patterns is now
fully established. Given a community and a timeout interval, the goal is therefore
to learn the arcs and associated probabilities from a set of episodes. Using these
probabilities, given the initial fragment of a new episode, it is able to predict the
propagation pattern of the episode.
In the following text, a closed-world assumption is made that all occurrences
of a topic except the first are the result of communication via edges in the network.
A topic in the following is a URL, phrase, name, or any other representation of a
meme that can be tracked from page to page. All blog entries can be gathered that
contain a particular topic into a list [(
u t
,
),(
u t
,
),...,(
u t
,
)]
1 1 2 2 sorted by the publi-
cation date of the blog, where u i is the universal identifier for blog user i , and t i is the
first time at which user u i contained a reference to the topic. his list can be referred
to as the traversalsequence for the topic. he following observation is critically used:
the fact that user a appears in a traversal sequence, and user b does not appear later
in the same sequence, gives us evidence about the ( a , b ) edge—that is, if b were a
regular reader of a 's blog with a reasonable copy probability, then sometimes memes
discussed by a should appear in b 's blog.
An EM-like algorithm is presented to induce the parameters of the transmis-
sion graph [51], in which the model first computes a “soft assignment” of each new
infection to the edges that may have caused it, and then updates the edge param-
eters to increase the likelihood of the assigned infections. Take an initial guess at
the value of r and k for each edge and improve the estimate of these values. A two-
stage process is adopted:
Soft-Assignment Step : Using the current version of the transmission graph,
compute for each topic and each pair ( u , v ) the probability that the topic traversed
the ( u , v ) edge. Given the traversal sequence and the delay between u and v for a
particular topic j as input, for each v in the sequence, consider all previous vertices
u in the sequence and compute the probability P u,v that topic j would have been
copied from u to v . hen normalize by the sum of these probabilities to compute
the posteriors of the probability that each node u was v 's source of inspiration.
hat is, setting r = r u,v , k = k u,v , and δ is to be the delay in days between u and v in
topic j :
k
k
r
(
1
1
r k
)
δ
=
P
:
(3.14)
u v
,
δ
r
(
r
)
k
w v
,
w v
,
w v
,
w v
,
<
w v
Parameter-UpdateStep : For fixed u and v , recompute r u,v and k u,v based on the pos-
terior probabilities just computed. Perform the following operation for each fixed u
and v . Let S 1 denote the set of topics j such that topic j appeared first at node u and
Search WWH ::




Custom Search