2.4. Proposal for spatial and temporal information indexing and retrieval in
textual corpora
We present the spatial and temporal IR platform that we have developed. It is
based mainly on the spatial and temporal data representation models and on the
spatial and temporal information indexing and retrieval process flows that we
designed. Referred to as PIV, this platform is tested on a corpus of
travelogues about the Pyrenees, written in French.
2.4.1. Reminder and focus on the notion of space and time in “heritage” corpora
We illustrate the spatial and temporal expressions typical of our corpus with a few
examples. We mainly deal with toponymic and calendar expressions.
From a spatial point of view, we aim at the recognition and interpretation of
geographic zones mentioned in the texts. Here are some examples of spatial
expressions extracted from our documents:
Pau; the Cerdagne; the inner city of Lourdes; the woods of Perthus; the
pastures of Pourtalet; the ridges of the Canigou range; the footpath of
Cadeilhan Trachères; in the forest of Iraty; at the foot of the Bastanet pass;
above Aragnouet; to the right of Oloron; more than two hours from Oule;
toward the Literola lake; near the Pic de Rouille; away from the road of
Perthus; not far from Balledrayt hut; the region of Vallier; the slope of Ariège;
close to the Saint-Béat and Luchon cantons; on the outskirts of the lowest
passes; to the west; the arid land to the south of the Aragon region.
Some expressions refer to entries in resources covering the Pyrenean
territory (IGN, collaborative resources such as OpenStreetMap, or free resources
such as GeoNames): these are toponyms (Pau) or extended toponyms that specify
the type of the spatial entity described (woods of Perthus). We will call these
absolute entities or, equivalently, absolute features. Others correspond to an
adaptation of one or more absolute spatial entities (next to the Rouille pass);
we will call these relative spatial entities or relative spatial features (RSFs)
[SAL 08]. Finally, spatial entities can be complete or incomplete. Incomplete
entities (on the outskirts of the lowest passes) cannot be located on a map,
since an elementary analysis limited to the noun phrase does not allow their
geolocalization.
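The distinctions drawn above (absolute versus relative, complete versus
incomplete) can be sketched as a small data model. This is only an illustrative
sketch, not the representation used in PIV: the class names, the field names,
and the rule used to decide completeness are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AbsoluteFeature:
    """A toponym found in a gazetteer, optionally with a feature type.

    Hypothetical structure: 'Pau' has no type, 'woods of Perthus' has
    toponym='Perthus' and feature_type='woods'.
    """
    toponym: str
    feature_type: Optional[str] = None

    def is_complete(self) -> bool:
        # Assumption: a named entry can always be placed on a map.
        return True


@dataclass(frozen=True)
class RelativeFeature:
    """A spatial relation applied to one or more absolute features."""
    relation: str                              # e.g. 'near', 'above'
    targets: Tuple[AbsoluteFeature, ...] = ()  # resolved absolute features
    unresolved: Tuple[str, ...] = ()           # noun phrases with no toponym

    def is_complete(self) -> bool:
        # Assumption: complete only if the relation has at least one
        # resolved target and no unresolved noun phrase.
        return bool(self.targets) and not self.unresolved


# Examples taken from the expressions listed in this section.
pau = AbsoluteFeature("Pau")
woods = AbsoluteFeature("Perthus", feature_type="woods")
near_rouille = RelativeFeature("near", (AbsoluteFeature("Pic de Rouille"),))
outskirts = RelativeFeature("on the outskirts of",
                            unresolved=("the lowest passes",))
west = RelativeFeature("to the west")

for feature in (pau, woods, near_rouille, outskirts, west):
    print(feature, feature.is_complete())
```

Under these assumptions, "near the Pic de Rouille" is complete because its
target is a named place, whereas "on the outskirts of the lowest passes" and
"to the west" are incomplete because the noun phrase alone names no place.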
From a temporal point of view, we are only interested in the calendar-like
temporal expressions in the documents. Here are some examples of temporal
expressions that can be found in our documents:
