that tag outside the segment. However, the segment size and the number of usage
occurrences inside and outside the segments significantly affect the analysis.
The evaluation has been focused on photos from the San Francisco Bay area.
Experimental analysis has been performed on a dataset including both photos and
tags. The dataset consists of 49,897 photos with an average of 3.74 tags per photo.
Each photo is also characterized by a location and a time. The location represents
the latitude-longitude coordinates either of the place where the photo was taken or
of the photographed object. The time represents either the photo capture time or the
time the photo was uploaded to Flickr. These photos cover a temporal range of
1,015 days, starting from 1 January 2004. The average number of photos per day
was 49.16, with a minimum of zero and a maximum of 643. From these photos, 803
unique tags were extracted. The maximum number of photos associated with a
single tag was 34,325 for the San Francisco Bay area, and the mean was 232.26. The
method described in [ 41 ] achieves good precision in classifying tags as either a
place or an event.
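
As a quick consistency check on the dataset figures quoted above, the short Python sketch below recomputes the average number of photos per day and estimates the total number of tag assignments in two ways. It assumes that the reported mean of 232.26 is the mean number of photos per unique tag; the text does not state this explicitly, and the variable names are illustrative only.

```python
# Figures quoted in the text above.
num_photos = 49_897
num_days = 1_015
avg_tags_per_photo = 3.74
num_unique_tags = 803
mean_photos_per_tag = 232.26  # assumption: mean taken over the 803 unique tags

# Average photos per day implied by the totals.
photos_per_day = num_photos / num_days

# Total (photo, tag) assignments estimated from both sides of the relation.
pairs_from_photos = num_photos * avg_tags_per_photo
pairs_from_tags = num_unique_tags * mean_photos_per_tag

print(f"photos per day: {photos_per_day:.2f}")
print(f"tag assignments (from photos): {pairs_from_photos:,.0f}")
print(f"tag assignments (from tags):   {pairs_from_tags:,.0f}")
```

Under that assumption the two estimates of the number of tag assignments differ by well under 1%, which is within the rounding of the quoted averages, and the photos-per-day value matches the reported 49.16.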
A parallel effort has been devoted to enhancing the approach proposed in [ 41 ] to
efficiently mine the huge photographic dataset managed by Flickr. The approach
proposed in [ 42 ] is organized in three steps. First, the issue of generating
representative tags for arbitrary areas in the world is addressed using a location-driven
approach. Georeferences associated with the uploaded photographs are initially
exploited to cluster photographs. Candidate tags within each cluster are then ranked
to select the most representative ones. The extracted tags often correspond to
landmarks within the selected area. Second, the method to identify tag semantics
proposed in [ 41 ] has been exploited. The method allows the automatic identification of
tags as places and/or events based on temporal and spatial tag usage distributions.
Lastly, tag-location-driven analysis is combined with computer vision techniques to
achieve the automatic selection of representative photographs of some landmark or
geographic feature. Tags that represent landmarks and places are initially selected
by the aforementioned location-driven approach. For each tag, the corresponding
images are clustered by the k-Means [ 33 ] clustering algorithm according to their
visual content to discover varying views of the landmark in question. For this
purpose, a range of complementary visual features is extracted from the images.
Clusters are subsequently ranked by applying four distinct methods so as to identify
the ones which best represent the various views associated with a given tag or
location. Finally, images within each cluster are also ranked according to how well
they represent the cluster.
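
The visual clustering step and the within-cluster ranking can be illustrated with a minimal Python sketch. It is not the implementation of [ 42 ]: the two-dimensional feature vectors, the image identifiers, the seeding of centroids with the first k points, and the use of distance to the centroid as the measure of how well an image represents its cluster are all assumptions made for illustration. The four cluster-ranking methods are not detailed in the text, so they are not shown.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignment


def rank_within_clusters(ids, points, centroids, assignment):
    """For each cluster, order image ids from closest to farthest from the centroid."""
    ranked = {c: [] for c in range(len(centroids))}
    for image_id, p, a in zip(ids, points, assignment):
        ranked[a].append((math.dist(p, centroids[a]), image_id))
    return {c: [image_id for _, image_id in sorted(items)]
            for c, items in ranked.items()}


# Hypothetical 2-D visual feature vectors for six images sharing one tag.
image_ids = ["a", "b", "c", "d", "e", "f"]
features = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.0), (5.0, 5.6), (1.0, 0.2), (6.0, 5.2)]

centroids, assignment = kmeans(features, k=2)
ranking = rank_within_clusters(image_ids, features, centroids, assignment)
print(ranking)
```

In this toy setting each cluster stands for one view of the landmark, and the first image in each ranked list is the one closest to the cluster centre. A real system would use high-dimensional visual features and a more careful centroid initialization.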
The proposed techniques have been evaluated on a set of over 110,000
georeferenced photos from the San Francisco area by manually selecting ten landmarks
of interest. Results showed that the tag-location-visual-based approach is able to
select representative images for a given landmark with an increase in precision of
more than 45% over the best nonvisual technique (tag-location based). Across most
of the locations, all of the selected images were representative. For some
geographical features, the visual-based methods still do not provide perfect precision in
image summaries, mainly due to the complex variety of scenarios and ambiguities
connected with the notion of representativeness.
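
The precision comparison reported above can be sketched as follows. The text does not say whether the increase of more than 45% is relative or absolute; this sketch assumes a relative increase. The judgment lists are hypothetical and are not data from the evaluation.

```python
def precision(judgments):
    """Fraction of selected images judged representative (True)."""
    return sum(judgments) / len(judgments)


def relative_increase(new, baseline):
    """Relative gain of `new` over `baseline`; 0.5 means +50%."""
    return (new - baseline) / baseline


# Hypothetical judgments for ten selected images per method (not the paper's data).
tag_location_only = [True, False, True, False, True, True, False, True, False, True]
tag_location_visual = [True, True, True, True, False, True, True, True, True, True]

p_base = precision(tag_location_only)      # 6 of 10 judged representative
p_visual = precision(tag_location_visual)  # 9 of 10 judged representative

print(f"baseline={p_base:.2f} visual={p_visual:.2f} "
      f"relative increase={relative_increase(p_visual, p_base):.0%}")
```

Averaging such per-landmark precision values over the ten selected landmarks would give one summary figure per method, assuming every landmark is weighted equally.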