Information Technology Reference
In-Depth Information
We collected tweet data using Twitter's Streaming API [1]. Each record
contains the user name, date, day, time, and tweet text. MeCab [2] was used
for a morphological analysis of Japanese tweets containing “now,” and we
extracted the expression before “now” that described human behaviour (a
noun, verb, or adjective). Table 4.1 shows some examples of tweet
data. For instance, if the tweet says, “staying in café, now,” the
corresponding event is the word “café” before “now.”
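The extraction rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the original pipeline ran MeCab on Japanese text, whereas this stand-in works on the English glosses and simply takes the last whitespace-separated word before "now". The names `NOW_PATTERN` and `extract_event` are hypothetical.

```python
import re
from typing import Optional

# Hypothetical stand-in for the MeCab-based step: no morphological
# analysis is performed, the text is only split on whitespace.
# The pattern captures everything before the first ", now" or " now".
NOW_PATTERN = re.compile(r"^(.*?)[,\s]+now\b", re.IGNORECASE)


def extract_event(tweet: str) -> Optional[str]:
    """Return the last word before "now", or None if "now" is not found."""
    match = NOW_PATTERN.search(tweet)
    if match is None:
        return None
    words = match.group(1).split()
    return words[-1] if words else None


print(extract_event("Staying in café, now."))
print(extract_event("Staying in Tokyo, now."))
```

Because this sketch splits on whitespace, "Got home, now." yields "home" rather than "Got home"; in the Japanese original the event is a single word, which a morphological analyser returns as one token.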
We considered two sets of tweets. Dataset #1 consisted of 18,584 tweets
collected between 20th and 26th June 2011 and contained 6,318 different
events. Dataset #2 consisted of 19,805 tweets collected between 13th and
19th September 2011 and contained 9,494 different events. We analysed and
compared the events in both sets.
Our data collection program, which used the Streaming API, behaved
unstably, and some parts of the data were not retrieved completely. We do
not believe that this seriously affects the illustration of our case study,
although complete data would make the analysis more reliable.
User       Date        Time      Tweet                   Event
White_luc  20/06/2011  17:30:24  Staying in café, now.   Café
Yamahaku   20/06/2011  17:33:09  Got home, now.          Got home
Adajmdap   20/06/2011  17:35:12  Staying in Tokyo, now.  Tokyo
Table 4.1. Examples of data including temporal information.
(Here, “got home” is one word in Japanese.)
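As a rough illustration of how such records might be held in memory, the sketch below parses the three rows of Table 4.1 and counts tweets and distinct events, the two figures reported for each dataset. The `TweetRecord` type and `parse_record` helper are hypothetical names, and the day/month/year date format is inferred from the table.

```python
from datetime import datetime
from typing import NamedTuple


class TweetRecord(NamedTuple):
    user: str
    posted_at: datetime
    tweet: str
    event: str


def parse_record(user: str, date: str, time: str, tweet: str, event: str) -> TweetRecord:
    """Combine the date and time columns into one timestamp (dd/mm/yyyy assumed)."""
    posted_at = datetime.strptime(f"{date} {time}", "%d/%m/%Y %H:%M:%S")
    return TweetRecord(user, posted_at, tweet, event)


# The three example rows from Table 4.1.
rows = [
    ("White_luc", "20/06/2011", "17:30:24", "Staying in café, now.", "Café"),
    ("Yamahaku", "20/06/2011", "17:33:09", "Got home, now.", "Got home"),
    ("Adajmdap", "20/06/2011", "17:35:12", "Staying in Tokyo, now.", "Tokyo"),
]

records = [parse_record(*row) for row in rows]
distinct_events = {record.event for record in records}

print(len(records), "tweets,", len(distinct_events), "different events")
```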
Observations
From the ChronoView of tweet dataset #1, we found that the events
“wake up” and “lunch” are drawn at almost the same size (Fig. 4.1). This
means that the two events have similar frequencies. However, “wake up” is
positioned farther from the circumference than “lunch.” We can infer from
this that the occurrence times of “wake up” are broadly spread. On the
other hand, “lunch” was placed near the circumference, suggesting that
this event is concentrated at a specific time. “Wake up” was placed near
the positions for 06:00 to 12:00, so this event occurred over a wide range
of times. In contrast, “lunch” was concentrated between 11:00 and 14:00;
clearly, “lunch” depends on a specific time of day. From the radial lines
1. https://dev.twitter.com/docs/streaming-api
2. http://mecab.sourceforge.net/
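The relationship between an event's distance from the circumference and the spread of its occurrence times can be sketched numerically. This section does not define the placement rule, so the sketch assumes that each occurrence time is mapped to a point on a unit 24-hour circle and that the event is drawn at the mean of those points. The function names and sample times are made up for illustration, and the angular orientation chosen here does not affect the distance.

```python
import math
from typing import Sequence, Tuple


def chrono_position(hours: Sequence[float]) -> Tuple[float, float]:
    """Mean of the unit-circle points for times given as fractional hours (0-24)."""
    if not hours:
        raise ValueError("at least one occurrence time is required")
    angles = [2 * math.pi * h / 24 for h in hours]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    return x, y


def distance_from_centre(hours: Sequence[float]) -> float:
    """1.0 means on the circumference; values near 0 mean widely spread times."""
    x, y = chrono_position(hours)
    return math.hypot(x, y)


# Made-up sample times, not taken from the datasets.
lunch = [11.5, 12.0, 12.5, 13.0]        # concentrated around midday
wake_up = [6.0, 7.5, 9.0, 10.5, 12.0]   # spread over the morning

print(round(distance_from_centre(lunch), 3))
print(round(distance_from_centre(wake_up), 3))
```

Under this assumption the concentrated "lunch" times land closer to the circumference than the spread-out "wake up" times, which matches the reading of Fig. 4.1 given above.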