Database Reference
In-Depth Information
each Tweet is written to a file periodically. This behavior can be modified as per
the requirement of the application, such as storing and indexing the Tweets in a
database. More discussion on the storage and indexing of Tweets will follow in
Chap. 3 .
Key Parameters : There are three key parameters:
￿ follow : a comma-separated list of userids to follow. Twitter returns all of their
public Tweets in the stream.
￿ track : a comma-separated list of keywords to track. Multiple keywords are
provided as a comma separated list.
￿ locations : a comma-separated list of geographic bounding box containing the
coordinates of the southwest point and the northeast point as (longitude, latitude)
pairs.
Rate Limit : Streaming APIs limit the number of parameters which can be
supplied in one request. Up to 400 keywords, 25 geographic bounding boxes and
5,000 userids can be provided in one request. In addition, the API returns all
matching documents up to a volume equal to the streaming cap. This cap is currently
set to 1% of the total current volume of Tweets published on Twitter.
2.6
Strategies to Identify the Location of a Tweet
Location information on Twitter is available from two different sources:
￿
Geotagging information: Users can optionally choose to provide location infor-
mation for the Tweets they publish. This information can be highly accurate if
the Tweet was published using a smartphone with GPS capabilities.
￿
Profile of the user: User location can be extracted from the location field in the
user's profile. The information in the location field itself can be extracted using
the APIs discussed above.
Approximately 1% of all Tweets published on Twitter are geolocated. This is
a very small portion of the Tweets, and it is often necessary to use the profile
information to determine the Tweet's location. This information can be used in
different visualizations as you will see in Chap. 5 . The location string obtained from
the user's profile must first be translated into geographic coordinates. Typically, a
gazetteer is used to perform this task. A gazetteer takes a location string as input,
and returns the coordinates of the location that best correspond to the string. The
granularity of the location is generally coarse. For example, in the case of large
regions, such as cities, this is usually the center of the city. There are several
online gazetteers which provide this service, including Bing™, Google™, and
MapQuest™. In our example, we will use the Nominatim service from MapQuest 11
11 http://developer.mapquest.com/web/products/open/nominatim
Search WWH ::




Custom Search