The second case in the list is more complex and interesting. It occurs when
users do not use the platforms' features to attach their location to the message but
instead mention, in the text of the message itself, the location they are writing
from or about.
First of all, it is important to determine whether the mention of a
geographical location in a message indicates that the message was produced
in that location or that it merely talks about it: the two possibilities can completely
change the relevance of the message.
We have formulated a working procedure for adding location information to
these kinds of messages.
We:
- built databases of Named Places for the various cities, including landmarks, street
  names, venues, restaurants, bars, shopping centers, and more, by combining the
  information coming from:
  - publicly available data sets (for example, for Italy we used the named
    places provided by ISTAT, Italy's National Statistics Institute, 2013);
  - the list of named places contained in the OpenStreetMap databases, for
    example as described in OpenStreetMap (2013a, b);
  - the lists of named places provided by the social networks themselves, whose
    APIs make it possible to discover the locations users attach to their
    messages, for example on Facebook (2013) or Foursquare (2013);
  - lists of relevant words and phrases, such as event names or landmarks;
- used the textual representations of the named places, in their various forms, in a
  series of phrase templates to try to understand whether the user writing the message
  was in the place, going to the place, leaving the place, or talking about the place;
  - for example, the template “*going to [named place]*” would identify the
    action of going, while “*never been in [named place]*” would identify the
    action of talking about a place;
  - templates have currently been composed in 29 different languages, for a total
    of more than 20,000 different templates;
- assigned each template a degree of confidence (its relevance), expressing how
  certain we could be that a sentence matching the template carries the intended
  information;
  - for example: “I'm going to [named place]” has a relevance of 1 (100 %), while
    “[named place]” taken by itself has a relevance of 0.2 (20 %), as it might be
    a false match (imagine a bar with the same name as a famous landmark, for
    example);
- established a threshold: if the sum of the relevance degrees of the templates
  matched to the sentences was above the threshold, the information about content
  location was kept; otherwise it was discarded. The threshold we currently use
  is 90 %.
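The matching and thresholding steps can be illustrated with a minimal Python sketch. It is not the implementation used in the project: the place names, the tiny template list, the action labels, and the function name `locate` are all hypothetical, and the relevance of 1.0 given to the “never been in” template is an assumption (the text only states the values 1 and 0.2 for the other two templates). The sketch also assumes that relevance is summed per named place and that the action reported is the one from the highest-relevance matching template.

```python
import re

# Hypothetical named places; a real database would merge ISTAT,
# OpenStreetMap, and social-network data as described in the text.
NAMED_PLACES = ["Colosseum", "Piazza Navona"]

# (template, action, relevance). "{place}" marks the named-place slot.
# The 1.0 for "never been in" is an assumed value, not taken from the text.
TEMPLATES = [
    ("i'm going to {place}", "going_to", 1.0),
    ("never been in {place}", "talking_about", 1.0),
    ("{place}", "mention", 0.2),
]

THRESHOLD = 0.9  # the 90 % threshold quoted in the text


def locate(message, places=NAMED_PLACES, templates=TEMPLATES,
           threshold=THRESHOLD):
    """Return {place: (action, summed_relevance)} for places above threshold."""
    text = message.lower()
    results = {}
    for place in places:
        total = 0.0
        best = None  # (relevance, action) of the strongest matching template
        for pattern, action, relevance in templates:
            phrase = pattern.format(place=place.lower())
            # Whole-phrase, case-insensitive match on word boundaries.
            if re.search(r"\b" + re.escape(phrase) + r"\b", text):
                total += relevance
                if best is None or relevance > best[0]:
                    best = (relevance, action)
        # Keep the location only if the summed relevance exceeds the threshold.
        if best is not None and total > threshold:
            results[place] = (best[1], total)
    return results


if __name__ == "__main__":
    print(locate("I'm going to Colosseum tonight"))
    print(locate("There is a bar called Colosseum near my office"))
```

In this sketch a bare mention of a place only accumulates 0.2 and is discarded, while a sentence that matches a high-relevance template passes the threshold and is tagged with that template's action.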