Fig. 1 Study area around the London city centre
convenient way, the authors employed the Twitter4J 2 Java programming library.
The target place for this study is the city of London, UK. As the collection area, the
authors selected a 28 km radius around the city centre. This radius
approximates London's bounding box. Some parts of the outskirts
of the Greater London area were not completely covered by the collection procedure.
Areas not covered by tweets were excluded from further analysis (Fig. 1).
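The radius criterion can be illustrated with a small sketch. It is not the authors' collection code (which relied on Twitter4J); it only shows one way to test whether a coordinate pair lies within 28 km of a centre point, using the haversine great-circle distance. The centre coordinates and the function names are assumptions, since the text does not state which point was used as the city centre.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius
# Assumed centre point in central London; the source does not give coordinates.
CENTRE_LAT, CENTRE_LON = 51.5074, -0.1278
RADIUS_KM = 28.0  # collection radius stated in the text


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_study_area(lat, lon):
    """True if the point lies within RADIUS_KM of the assumed centre."""
    return haversine_km(CENTRE_LAT, CENTRE_LON, lat, lon) <= RADIUS_KM
```

Under these assumptions, a point near Heathrow (51.4700, -0.4543) falls inside the study area, while a point in Brighton (50.8225, -0.1372) falls outside it.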
The Twitter data for this work were collected over the course of 3 months,
July to September 2013. In total, about 250,000 tweets were collected. These
tweets are all geolocated, which means that each tweet carries spatial
coordinates. Geolocated tweets make up a subset of around 1 % of the total number
of tweets (Morstatter et al. 2013). Each tweet record contains the date of
the tweet and its associated time (HH:MM:SS), the Twitter user name, the
unique internal Twitter user ID, the coordinates of the tweet, and the
message of the tweet.
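The fields listed above could be held in a simple record type, sketched below. The class name, field names, column order and date format (YYYY-MM-DD) are assumptions made for illustration; the text specifies only which pieces of information were stored and that the time has the form HH:MM:SS.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Tweet:
    posted_at: datetime  # date of the tweet combined with its time (HH:MM:SS)
    user_name: str       # Twitter user name
    user_id: int         # unique internal Twitter user ID
    lat: float           # latitude of the tweet
    lon: float           # longitude of the tweet
    message: str         # text of the tweet


def parse_row(row):
    """Build a Tweet from seven string fields (hypothetical column order)."""
    date_str, time_str, user_name, user_id, lat, lon, message = row
    posted_at = datetime.strptime(f"{date_str} {time_str}", "%Y-%m-%d %H:%M:%S")
    return Tweet(posted_at, user_name, int(user_id), float(lat), float(lon), message)
```

Combining date and time into one timestamp keeps the later day-time and night-time split a matter of inspecting a single field.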
As a first step of pre-processing, the tweets were separated into day-time and
night-time tweets. The day-time range runs from 07:00:00 to 18:59:59, while the
night-time range runs from 19:00:00 to 06:59:59. Each of the two
subsets contains almost exactly 50 % of the tweets.
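The day-time and night-time split described above can be sketched as follows. The boundaries are those given in the text; the function names are hypothetical, and the sketch assumes that tweet times are already expressed in London local time. Note that the night-time range wraps around midnight, so it is easiest to define it as everything outside the day-time range.

```python
from datetime import time

DAY_START = time(7, 0, 0)     # first second of the day-time range
NIGHT_START = time(19, 0, 0)  # first second of the night-time range


def is_daytime(t):
    """True for 07:00:00 to 18:59:59; everything else counts as night-time."""
    return DAY_START <= t < NIGHT_START


def split_day_night(times):
    """Partition an iterable of datetime.time values into (day, night) lists."""
    day, night = [], []
    for t in times:
        (day if is_daytime(t) else night).append(t)
    return day, night
```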
The second step of pre-processing addresses the need to separate tourists
from locals, because socio-demographic data are based on the residents of an area. The
approach chosen to address this issue is to work with tweets from users who posted
2 Twitter4J Library: http://twitter4j.org/en (2013-12-05).