Geoscience Reference
In-Depth Information
subject to debate, however the notion of territory generally integrates a geographic
space composed of places (spatial component) as well as relations with different
subjects (thematic component) and/or references to a period (temporal component)”.
It also describes a second point of view, that of geomatics. “Geomatics is the
scientific field hovering between geography and computer science which mainly deals
with problems of storage, processing and diffusion of geographic information. The
characterization of geographic information in a particular territory is defined in the
form of geographic entities (GEs) composed of spatial (SEs), temporal (TEs) and
thematic entities. It should be noted that each one of these entities is not always
specified or can be implicit”. Kergosien [KER 11] proposes an ontology construction
approach, both as a tool for the structured representation of a territory and as a
support for IR and for browsing document repositories.
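The geomatics view quoted above, in which a geographic entity (GE) groups spatial (SE), temporal (TE) and thematic entities, any of which may be unspecified or implicit, can be illustrated with a minimal sketch. The class and field names below are hypothetical and chosen only for illustration; they are not taken from [KER 11] or from any existing library.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class SpatialEntity:
    """SE: a reference to a place (hypothetical minimal form: a name)."""
    name: str


@dataclass(frozen=True)
class TemporalEntity:
    """TE: a reference to a period, kept here as a raw expression."""
    expression: str


@dataclass(frozen=True)
class ThematicEntity:
    """Thematic component: the subject the place is related to."""
    label: str


@dataclass(frozen=True)
class GeographicEntity:
    """GE: each component is Optional because it may be unspecified or implicit."""
    spatial: Optional[SpatialEntity] = None
    temporal: Optional[TemporalEntity] = None
    thematic: Optional[ThematicEntity] = None

    def specified_components(self) -> List[str]:
        """Return the names of the components that are explicitly present."""
        names = ("spatial", "temporal", "thematic")
        return [n for n in names if getattr(self, n) is not None]


# Illustrative instance: a place and a theme, with no explicit period.
ge = GeographicEntity(
    spatial=SpatialEntity("Furan"),
    thematic=ThematicEntity("river"),
)
```

Making every component optional is one possible way to reflect the remark that an entity "is not always specified or can be implicit"; an actual model could instead use explicit placeholder values.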
- Examples of corpora: Territory is at the heart of numerous types of corpora. We
can cite, for example, the French-language corpus of archives, mainly composed of
texts, maps and lithographs related to the city of Saint-Étienne and to its river Furan 1;
the multilingual corpus (German and French) of the Swiss Alpine Club 2, composed
of reports, accounts, essays and reflections on the theme of mountaineering; tourist
guides such as the different ranges of Lonely Planet 3 topics or of the Michelin guide 4;
and the equally numerous hiking guides 5 and other travel blogs 6.
These corpora have the principal characteristic of containing a very large number
of place names (spatial named entities will be defined further on); the places referred
to in such a way generally have a fine level of detail in a relatively confined space (a
river, a city and mountain range, for example). The Geotopia 7 and Text+Berg digital 8
projects are good examples of this. The objective of the first is to experiment with
geo-referencing techniques in order to help organize, transmit, share and interpret archival
data [JOL 11]. The second aims to digitize and promote a corpus of alpine literature
[VOL 10].
- The corpus of MIDR: MIDR 9, with a view to promoting cultural heritage,
has digitized its heritage document repository and applied optical character
recognition to it, with the aim of indexing it in a document retrieval system. This
way, the digitized documents can benefit from a renewed visibility and be exploited
1 http://umrisig.wordpress.com/les-projets/projet-geotopia.
2 http://www.textberg.ch.
3 http://shop.lonelyplanet.com.
4 http://voyage.viamichelin.fr.
5 http://www.ffrandonnee.fr/boutique/le-catalogue-des-topo-guides.aspx.
6 http://www.blogs-de-voyage.fr.
7 http://umrisig.wordpress.com/les-projets/projet-geotopia.
8 http://www.textberg.ch.
9 Médiathèque Intercommunale à Dimension Régionale de Pau Pyrénées - http://www.agglo-pau.fr.
 