information more meaningful. Methods for automatically processing spatiotemporal
and semantic information from text documents are discussed in this chapter.
The rest of this chapter is organized as follows: The next section reviews related
work in the fields of GIR and GIScience. In the section that follows, the main features
of an approach that automatically extracts spatiotemporal information from text
documents are introduced, and algorithms for combining spatiotemporal information to
model the geographic dynamics are demonstrated. A hazard ontology is presented
that can be used for modeling the semantics associated with storm events, followed
by a section that demonstrates the work for a tornado outbreak in the central US in April
2012 as described in web news reports. The evaluation compares the
results obtained from manual and automated processing. Conclusions
are discussed in the final section of the chapter.
10.2 Background
In this research, a hybrid method is developed that combines gazetteers and ontologies
to extract the semantics of hazard-related events sourced from web texts. With this
method, not only is spatiotemporal information automatically represented and tracked
over space-time, but text-based semantics are also automatically processed, adding
meaning to the spatiotemporal information. Note that the term 'gazetteer'
is applied more broadly in NLP than in Geography, where a gazetteer traditionally
refers to a dictionary that contains lists of geographic references.
In NLP, a gazetteer instead refers to a list of specific terms that are used to match the
corresponding information in text documents. In this work, a spatial gazetteer
refers to a list of spatial terms commonly found in text, and a temporal gazetteer
contains a list of temporal phrases.
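The idea of a gazetteer as a term list matched against text can be illustrated with a minimal sketch. It is not the chapter's implementation: the gazetteer entries, the `Annotation` type, and the `build_pattern` and `annotate` helpers are all hypothetical names chosen for illustration, and simple case-insensitive regular-expression matching stands in for whatever matching an NLP system actually performs.

```python
import re
from typing import NamedTuple

# Hypothetical gazetteers: short illustrative term lists, not the chapter's own.
SPATIAL_GAZETTEER = ["Kansas", "Oklahoma", "Wichita", "Sedgwick County"]
TEMPORAL_GAZETTEER = ["Saturday night", "April 14", "early Sunday"]


class Annotation(NamedTuple):
    start: int  # character offset where the match begins
    end: int    # character offset just past the match
    text: str   # matched words or phrase
    kind: str   # which gazetteer produced the match


def build_pattern(terms):
    # Try longer terms first so a multi-word phrase is preferred
    # over any shorter term that starts at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def annotate(text, gazetteers):
    # Compare the text with each gazetteer and record every match.
    annotations = []
    for kind, terms in gazetteers.items():
        for match in build_pattern(terms).finditer(text):
            annotations.append(
                Annotation(match.start(), match.end(), match.group(0), kind)
            )
    return sorted(annotations)  # ordered by position in the text


sentence = "A tornado struck Wichita in Sedgwick County, Kansas, on Saturday night."
result = annotate(
    sentence,
    {"spatial": SPATIAL_GAZETTEER, "temporal": TEMPORAL_GAZETTEER},
)
for annotation in result:
    print(annotation.kind, repr(annotation.text), annotation.start, annotation.end)
```

Keeping the spatial and temporal lists separate lets each annotation carry its type, which is what allows the two kinds of information to be combined later.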
A key objective of GIR is to detect and capture location-based information from
natural language text. Most GIR systems are based on detecting spatial references
in text, for example, latitude, longitude, state, county, and city names. To extract
geographic information from text documents, a spatial gazetteer is a key element
for data processing and affects the accuracy of extraction results. Specifically, the
references in the documents are compared with the terms in the gazetteers and, if a
match is found, those words or phrases from text documents are annotated by the
NLP system. Numerous systems have been developed based on GIR techniques.
For example, GIPSY, a geo-referenced information processing system, supports
automatic geographic indexing of text documents (Woodruff and Plaunt 1994)
by matching terms in the document to terms in a gazetteer. Since GIPSY,
gazetteers have become a standard component of
many GIR systems. NewsStand, another GIR system, detects geographic
information from RSS feeds using a custom-built geotagger (Teitler et al. 2008).
The extracted locations are displayed via a map viewer that dynamically displays
the primary location associated with each news article based on the frequency of