Chapter 2
Crawling Twitter Data
Users on Twitter generate over 400 million Tweets every day. 1 Some of these Tweets
are available to researchers and practitioners through public APIs at no cost. In
this chapter we will learn how to extract the following types of information from
Twitter:
• Information about a user,
• A user's network consisting of his connections,
• Tweets published by a user, and
• Search results on Twitter.
APIs to access Twitter data can be classified into two types based on their design
and access method:
• REST APIs are based on the REST architecture 2 now popularly used for
designing web APIs. These APIs use the pull strategy for data retrieval. To collect
information, a user must explicitly request it.
• Streaming APIs provide a continuous stream of public information from
Twitter. These APIs use the push strategy for data retrieval. Once a request for
information is made, the Streaming APIs provide a continuous stream of updates
with no further input from the user.
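The difference between the pull and push strategies can be illustrated with a small sketch. The code below does not contact Twitter and does not use any real Twitter library. The class `TweetSource` and its methods `publish`, `fetch`, and `subscribe` are hypothetical names introduced only for this illustration, and the Tweets are kept in memory.

```python
from typing import Callable, List


class TweetSource:
    """Hypothetical in-memory stand-in for Twitter (not a real client)."""

    def __init__(self) -> None:
        self._tweets: List[str] = []
        self._subscribers: List[Callable[[str], None]] = []

    def publish(self, text: str) -> None:
        # Store the Tweet, then push it to every registered subscriber.
        self._tweets.append(text)
        for callback in self._subscribers:
            callback(text)

    def fetch(self, since: int = 0) -> List[str]:
        # Pull strategy (REST-like): data is returned only when requested.
        return self._tweets[since:]

    def subscribe(self, callback: Callable[[str], None]) -> None:
        # Push strategy (Streaming-like): one request, then updates arrive
        # with no further input from the caller.
        self._subscribers.append(callback)


source = TweetSource()
source.publish("first")

# Pull: we see only what exists at the moment of the request.
pulled = source.fetch()

# Push: we register once and later Tweets are delivered to us.
pushed: List[str] = []
source.subscribe(pushed.append)
source.publish("second")
source.publish("third")

# To see the new Tweets with the pull strategy, we must ask again.
pulled_again = source.fetch(since=len(pulled))
```

In this sketch the pull client has to issue a second request to learn about the later Tweets, while the push client receives them as they are published.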
The two types of APIs have different capabilities and limitations with respect to
what and how much information can be retrieved. The Streaming API has three types
of endpoints:
• Public streams: These are streams containing the public Tweets on Twitter.
• User streams: These are single-user streams, with access to all the Tweets of a user.
• Site streams: These are multi-user streams intended for applications which
access Tweets from multiple users.
 