2.5 Conclusion and Future Works
2.5.1 Conclusion
Twitter is one of the most popular micro-blogging and social networking services.
It allows users to read and post short messages limited to 140 characters. With
its increasing popularity as a micro-blogging system, Twitter has become one of
the best ways of monitoring users' views on particular products or on topics in
general. Moreover, because users often share their views on a matter before the
event itself takes place, analyzing current tweets makes it possible to make
predictions. Twitter also helps to identify negative and positive opinions about a
brand or a product. To support the analysis of users' posts about particular issues
or products, a web-based sentiment tracking system for Twitter was developed that
satisfies the requirements of Carl described in the use case section.
In Sect. 2.2, blogging and micro-blogging are described first. Afterward, Twitter
and statistical data about Twitter are presented. Then, a short history of Natural
Language Processing is provided. The concept of sentiment analysis is also analyzed
in depth, and the Naïve Bayes and Maximum Entropy classification methods are
briefly explained.
The technologies used in this chapter are briefly introduced in Sect. 2.3.
First, Python and the Natural Language Toolkit (NLTK) are introduced. Afterward,
the properties of Twitter and the Twitter API are elaborated, followed by the
Google API. The implementation section also covers three subsections:
hand-classified dataset, background processes, and web interface. The
hand-classified dataset includes 1,035 tweets hand-classified as positive or
negative. To enhance the performance of our classification methods, the tweet texts
in the training dataset and the tweets collected via the Twitter API are
preprocessed. How our Python script collects and classifies tweets is described in
the background processes subsection. The basic properties of our web interface are
also described in this section, as is how our system performs scheduled tasks, such
as hourly tweet collection.
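
The preprocessing step mentioned above can be illustrated with a minimal sketch.
The conclusion does not restate which normalizations were applied, so the steps
below (lowercasing, replacing links and user mentions with placeholder tokens,
dropping the hash sign, and shortening stretched words) are assumptions drawn from
common practice in tweet sentiment analysis. The function name preprocess_tweet and
the placeholder tokens URL and USER are hypothetical.

```python
import re

# Assumed normalization rules; not the chapter's exact pipeline.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # three or more of the same character in a row
TOKEN_RE = re.compile(r"[a-z']+|URL|USER")


def preprocess_tweet(text):
    """Return a list of normalized tokens for one tweet (illustrative only)."""
    text = text.lower()
    text = URL_RE.sub("URL", text)        # links carry little sentiment
    text = MENTION_RE.sub("USER", text)   # hide individual account names
    text = text.replace("#", "")          # keep the hashtag word, drop the sign
    text = REPEAT_RE.sub(r"\1\1", text)   # "loooove" becomes "loove"
    return TOKEN_RE.findall(text)         # punctuation and digits are discarded


tokens = preprocess_tweet("Loooove the new @Microsoft logo!!! http://t.co/xyz #brand")
print(tokens)
```

Applying the same function to both the hand-classified training tweets and the
newly collected tweets keeps the two feature vocabularies consistent, which is the
reason for preprocessing both.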
Several evaluations are made in Sect. 2.4 to test and improve the performance of
the Naïve Bayes and Maximum Entropy classifiers. For example, the performance of
both classification methods is evaluated while varying the size of the training
set. How the Maximum Entropy classification method is affected by the number of
training iterations is shown as well.
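
The idea of evaluating a classifier while varying the amount of training data can
be sketched as follows. This is not the chapter's NLTK-based code: it is a small
from-scratch multinomial Naïve Bayes with add-one smoothing, and the four training
tweets and two test tweets are made-up toy examples rather than the 1,035
hand-classified tweets. All function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count labels and per-label words from (tokens, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def classify(model, tokens):
    """Return the label with the highest log-probability score."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    correct = sum(classify(model, toks) == label for toks, label in examples)
    return correct / len(examples)


def learning_curve(train, test, sizes):
    """Train on the first n examples for each n and report test accuracy."""
    return [(n, accuracy(train_naive_bayes(train[:n]), test)) for n in sizes]


# Toy data invented for illustration only.
TRAIN = [
    ("love the new logo".split(), "pos"),
    ("terrible update".split(), "neg"),
    ("great phone love it".split(), "pos"),
    ("hate the new logo".split(), "neg"),
]
TEST = [("love it".split(), "pos"), ("hate it".split(), "neg")]

for n, acc in learning_curve(TRAIN, TEST, [2, 4]):
    print(n, acc)
```

With a real dataset, the test tweets should be held out once and kept fixed across
all training sizes, so that differences in accuracy reflect only the amount of
training data.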
It is also demonstrated that both the Naïve Bayes and the Maximum Entropy
classification methods can be used to conduct sentiment analysis, but the
Maximum Entropy method is noticeably slower to train than Naïve Bayes.
Furthermore, 362,529 tweets were collected and automatically classified, as
described in Sect. 2.3. While collecting tweets about the four companies over
9 days, we noticed that current news about brands has an impact on the views of
Twitter users. During tracking, Microsoft announced on August 23 that its brand
logo had changed. A lot of tweets were collected about that news and it was