Crawling Twitter Data - Twitter Data Analytics

Chapter 2

Crawling Twitter Data

Users on Twitter generate over 400 million Tweets every day.¹ Some of these Tweets are available to researchers and practitioners through public APIs at no cost. In this chapter we will learn how to extract the following types of information from Twitter:

Information about a user,

A user's network, consisting of their connections,

Tweets published by a user, and

Search results on Twitter.

APIs to access Twitter data can be classified into two types based on their design and access method:

REST APIs are based on the REST architecture,² which is now widely used for designing web APIs. These APIs use the pull strategy for data retrieval: to collect information, a user must explicitly request it.

Streaming APIs provide a continuous stream of public information from Twitter. These APIs use the push strategy for data retrieval: once a request for information is made, the Streaming APIs deliver a continuous stream of updates with no further input from the user.
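The difference between the pull and push strategies can be illustrated with a minimal sketch. It does not call Twitter at all: `FakeTweetSource`, `fetch_since`, `subscribe`, and `publish` are hypothetical names invented for this illustration, and the class is only an in-memory stand-in for a service that offers both access styles.

```python
from typing import Callable, List


class FakeTweetSource:
    """Hypothetical in-memory stand-in for a service; not Twitter's real API."""

    def __init__(self) -> None:
        self._tweets: List[str] = []
        self._subscribers: List[Callable[[str], None]] = []

    def publish(self, text: str) -> None:
        """Simulate a new Tweet arriving at the service."""
        self._tweets.append(text)
        # Push: every subscriber is notified without asking again.
        for callback in self._subscribers:
            callback(text)

    def fetch_since(self, index: int) -> List[str]:
        """Pull: return Tweets from position `index` on; the caller must ask each time."""
        return self._tweets[index:]

    def subscribe(self, callback: Callable[[str], None]) -> None:
        """Push: register once, then receive every later Tweet via `callback`."""
        self._subscribers.append(callback)


source = FakeTweetSource()

pushed: List[str] = []
source.subscribe(pushed.append)   # a single request opens the "stream"

source.publish("first")
source.publish("second")

pulled = source.fetch_since(0)    # pull: an explicit request for data so far

source.publish("third")

# The pull client has not seen "third" yet and must request it explicitly.
later = source.fetch_since(len(pulled))

print(pulled)   # collected by the explicit request
print(later)    # collected by the follow-up request
print(pushed)   # delivered by push with no further requests
```

In this sketch the pull client only learns about the third Tweet after issuing a second request, while the push client receives it as soon as it is published.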

The two types of API have different capabilities and limitations with respect to what and how much information can be retrieved. The Streaming API has three types of endpoints:

Public streams: These are streams containing the public Tweets on Twitter.

User streams: These are single-user streams, with access to all the Tweets of a user.

Site streams: These are multi-user streams, intended for applications that access Tweets from multiple users.
