the profile of the cluster. For example, if we know that a person often takes photos
of sport events and nature, then occurrences of his photos in a cluster may help us
to identify the semantics of a cluster.
4.2 Potential Sources of Semantic Data
The following data sources can be used to extract semantic information about
places or events. The primary source is the photo collection itself, which
includes all the relevant information such as coordinates, tags and titles.
Additional sources of information are Wikipedia encyclopedic pages
and the GeoNames database.
4.2.1 Geo-referenced Photo Collections
Panoramio contains millions of geotagged photos. It is used by Google Maps and
Google Earth as one of the visualization layers. Its publicly available API allows
photo metadata to be downloaded by providing a bounding box of the desired
area. The most important information provided by the API is: photo
id and coordinates, owner id and name, photo URL and title.
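The bounding-box selection described above can be illustrated with a minimal sketch. It does not call the Panoramio API. It only shows how metadata records carrying the fields listed above could be filtered locally by a bounding box. The names `BoundingBox`, `PhotoRecord` and `photos_in_box` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in degrees (hypothetical helper, not an API type)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        # Simplification: boxes that cross the antimeridian are not handled.
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


@dataclass(frozen=True)
class PhotoRecord:
    """Metadata fields named in the text; the field names are assumptions."""
    photo_id: int
    lon: float
    lat: float
    owner_id: int
    owner_name: str
    photo_url: str
    title: str


def photos_in_box(records: Iterable[PhotoRecord],
                  box: BoundingBox) -> List[PhotoRecord]:
    """Return the records whose coordinates fall inside the box."""
    return [r for r in records if box.contains(r.lon, r.lat)]
```

A caller would build a `BoundingBox` for the area of interest and pass it together with the downloaded records to `photos_in_box`.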
Another source of geotagged photos is Flickr. Flickr has a larger user database
and its API provides more metadata than Panoramio, such as
thematic photo groups, contacts (favourites) of users and user information
including place of residence (filled in by 13 % of users). The Flickr API does not
allow downloading metadata by specifying exact boundaries of the area of interest.
Therefore, we used an approach similar to Web crawling. We downloaded all the
photo metadata of arbitrarily selected users and obtained the list of their
contacts as well as the list of groups their photos belong to. The same procedure was
iteratively applied to the other retrieved users. We began collecting the data at the
beginning of June 2009. By the end of March 2010, we had collected 87,665,970
entries from 7,449,723 users and 394,830 thematic photo groups. This amount of
data allows us to analyze virtually every place on Earth that has been
visited by photographers.
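The crawling procedure can be sketched as a breadth-first traversal of the contact graph. This is a minimal illustration under stated assumptions, not the implementation used for the collection. The callables `get_photos`, `get_contacts` and `get_groups` are hypothetical placeholders for whatever metadata requests are actually issued, and the sketch expands the user set through contacts only.

```python
from collections import deque


def crawl_users(seed_users, get_photos, get_contacts, get_groups,
                max_users=None):
    """Snowball crawl starting from arbitrarily selected seed users.

    get_photos(user)    -> iterable of dicts with at least an "id" key
    get_contacts(user)  -> iterable of user ids
    get_groups(photo_id) -> iterable of group ids
    All three callables are hypothetical and supplied by the caller.
    """
    queue = deque(dict.fromkeys(seed_users))  # de-duplicated, order kept
    seen = set(queue)
    visited = []
    photos = {}
    groups = set()

    while queue:
        if max_users is not None and len(visited) >= max_users:
            break
        user = queue.popleft()
        visited.append(user)

        # Store the metadata of every photo of this user, keyed by photo id,
        # and record the groups each photo belongs to.
        for photo in get_photos(user):
            photos[photo["id"]] = photo
            groups.update(get_groups(photo["id"]))

        # Enqueue contacts that have not been scheduled yet.
        for contact in get_contacts(user):
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)

    return visited, photos, groups
```

Keeping a `seen` set prevents a user from being processed twice when two users list each other as contacts, and `max_users` gives a simple way to stop a crawl that would otherwise run until the reachable graph is exhausted.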
We are aware that user-generated data such as photo collections can include
incorrect spatial and temporal information. For example, 10,117 photos did not
include a date and 55,176 photos carry dates later than 2010, that is, after the
time of collection.
These photos have to be excluded from the temporal analysis. However, there are
cases in which an incorrect entry is difficult or impossible to detect: whether
the camera clock was adjusted to the local time (in most cases this is done
manually by the person) or whether the photo was geotagged correctly during the
upload process (if the camera was not equipped
with GPS). Still, not all of these problems are critical. Spatial aggregation does not
require timestamps, and the aggregation level in space and time may be larger than
the position or time reference errors.
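The two points above, excluding photos with unusable dates from temporal analysis and aggregating in space without timestamps, can be sketched as follows. The cut-off `COLLECTION_END`, the cell size and the record layout (dicts with `lon`, `lat` and `taken` keys) are assumptions made for illustration. A regular degree grid stands in here for whatever aggregation scheme is actually used.

```python
import math
from collections import Counter
from datetime import datetime

# Assumption: the end of March 2010 is used as the end of the collection period.
COLLECTION_END = datetime(2010, 3, 31, 23, 59, 59)


def has_usable_timestamp(taken, collection_end=COLLECTION_END):
    """True if a date exists and does not lie after the collection period."""
    return taken is not None and taken <= collection_end


def grid_cell(lon, lat, cell_deg):
    """Map a coordinate to the index of its cell in a regular degree grid."""
    return (math.floor(lon / cell_deg), math.floor(lat / cell_deg))


def count_per_cell(photos, cell_deg=0.01):
    """Count photos per grid cell; timestamps are not needed for this step."""
    return Counter(grid_cell(p["lon"], p["lat"], cell_deg) for p in photos)
```

Photos rejected by `has_usable_timestamp` can still be passed to `count_per_cell`. Position errors smaller than the cell size mostly leave a photo in the same cell, which is the sense in which a coarse aggregation level absorbs small reference errors.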