Fig. 5.3 Solution framework
interval, the keywords indicating the trending topics appear as high-frequency words
and tend to be assigned to the background topic. To address this, we conduct Twitter-
LDA on the whole Twitter dataset to obtain a global background topic, and then fix
the background topic to run Twitter-LDA at each local time interval.
Our YouTube video recommender works on a daily basis, i.e., retrieving the
overlapped users' Twitter activity data to discover short-term interest and perform
video recommendation each day. Therefore, we set the local time interval to one
day for short-term interest modeling on Twitter. Specifically, we pool all the overlapped
users' tweets and retweets from each day into one collection and run Twitter-LDA on it to
obtain the topic-word distributions $\hat{\phi}_{1:K}$ and each user's daily Twitter interest
distribution $\theta_u$. Since the focus is on recommending the most promising topic from
Twitter to YouTube, we keep only the most probable topic the user is involved in on Twitter
each day, i.e., $\arg\max_k \theta_{u,k}$. Therefore, we can simply calculate the relevance
score of a YouTube video $v$ with user $u$ based on short-term interest as the cosine
similarity between their Vector Space Model (VSM) representations:
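$$
s_{short}(v, u) = \frac{\mathbf{v}^{T}\mathbf{z}_k}{\sqrt{\mathbf{v}^{T}\mathbf{v}\;\mathbf{z}_k^{T}\mathbf{z}_k}} \tag{5.1}
$$
where $\mathbf{v}$ is the VSM vector of video $v$ and $\mathbf{z}_k$ is the VSM vector of the topic $k$ selected for user $u$.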
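The short-term relevance score described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (`dominant_topic`, `cosine_similarity`, `short_term_score`) and the toy data are hypothetical, and the sketch assumes that the VSM vector of a topic is its row of the topic-word distribution and that the video's text is already encoded as term weights over the same vocabulary.

```python
import math
from typing import Sequence


def dominant_topic(theta_u: Sequence[float]) -> int:
    """Return the index of the most probable topic in a user's daily distribution."""
    return max(range(len(theta_u)), key=lambda k: theta_u[k])


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must be defined over the same vocabulary")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def short_term_score(
    video_vec: Sequence[float],
    theta_u: Sequence[float],
    topic_word: Sequence[Sequence[float]],
) -> float:
    """Score a video against the user's single most probable topic of the day."""
    k = dominant_topic(theta_u)
    # Assumption: the topic's VSM vector is its topic-word distribution row.
    return cosine_similarity(video_vec, topic_word[k])


# Toy data, made up for illustration: 2 topics over a 3-word vocabulary.
topic_word = [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]]
theta_u = [0.3, 0.7]          # the user's daily topic distribution
video_vec = [0.0, 1.0, 4.0]   # term weights of the video's text

score = short_term_score(video_vec, theta_u, topic_word)
print(f"dominant topic: {dominant_topic(theta_u)}, score: {score:.4f}")
```

Keeping only the dominant topic means that each user is compared against one word vector per day, so ranking candidate videos reduces to one cosine computation per video.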