First, integrating the geographic ontology GéOnto (see project GEONTO,
Chapter 1) into the semantic processing stage has improved the quality of the
classification of SFs and, consequently, of their retrieval in the geographic
resources and of the calculation of the corresponding numeric representations in
the next stage. The resulting improvement in the interpretation quality
(geolocalization) of the SFs amounts to 85%. This is easily explained with the
example of the toponym “torrent of Cauterets”, which, in the standard flow, is
geolocalized by the geometry relative to the commune of Cauterets and, in the flow
using GéOnto, is classified as a hydronym and geolocalized by the corresponding
stream geometry. Moreover, the overall time gain for the whole process is 28%,
and the time gain for the geolocalization of a single SF is 84%. Indeed, the
typing of the entities allows us to directly query the
corresponding resource/table in the IGN gazetteer. This has a double effect:
respecting the meaning attached to the corresponding toponym and increasing the
parsing speed of the resource.
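
The effect of typing on the lookup can be illustrated with a minimal sketch. It is not the PIV implementation: the table names, the term-to-type mapping, the function names and the placeholder geometries are all hypothetical, and the toy dictionary merely stands in for the gazetteer tables. The standard flow scans every table and keeps the first name match, whereas the typed flow queries only the table that corresponds to the inferred type.

```python
# Toy gazetteer: one table per feature type (hypothetical names and geometries).
GAZETTEER = {
    "commune": {"Cauterets": "POLYGON(commune of Cauterets)"},
    "hydronym": {"Cauterets": "LINESTRING(torrent of Cauterets)"},
}

# Hypothetical mapping from an introducing term to an ontology-derived type.
TERM_TO_TYPE = {"torrent": "hydronym", "commune": "commune", "village": "commune"}


def classify(sf_text):
    """Return (feature_type, place_name); feature_type is None if untyped."""
    words = sf_text.split()
    feature_type = TERM_TO_TYPE.get(words[0].lower())
    return feature_type, words[-1]


def geolocate_standard(sf_text):
    """Standard flow: scan every table and keep the first name match."""
    name = sf_text.split()[-1]
    for table, entries in GAZETTEER.items():
        if name in entries:
            return table, entries[name]
    return None


def geolocate_typed(sf_text):
    """Typed flow: query only the table matching the inferred type."""
    feature_type, name = classify(sf_text)
    if feature_type is None:
        # No type could be inferred: fall back to the standard flow.
        return geolocate_standard(sf_text)
    geometry = GAZETTEER.get(feature_type, {}).get(name)
    return (feature_type, geometry) if geometry is not None else None


standard_result = geolocate_standard("torrent of Cauterets")
typed_result = geolocate_typed("torrent of Cauterets")
```

In this toy setting, `standard_result` points to the commune geometry while `typed_result` points to the stream geometry, and the typed flow inspects a single table instead of all of them.
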
Another significant development concerns the use of geometric primitives to
refine the RSF representations initially approximated by bounding boxes.
This work continues the propositions made by Malandain [MAL 00]
and by Fu et al. [FU 05] for the interpretation of information units extracted from
textual or iconographic documents. Thus, our work described in [LES 07] and
[SAL 08] proposes specific configurable functions taking advantage of GIS
functionalities and fuzzy notions for new spatial representations of adjacency and
inclusion relationships. These new algorithms improve the precision of the numeric
representations of RSFs and reduce the problem of noise due to the approximations
by the bounding boxes.
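
The noise introduced by bounding boxes can be illustrated with a short sketch. It does not reproduce the configurable functions of [LES 07] and [SAL 08]: it is a plain crisp inclusion test in pure Python, with no fuzzy component, and the triangle and candidate points are invented for the illustration. A point may fall inside the bounding box of a geometry while lying outside the geometry itself.

```python
def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_bounding_box(point, polygon):
    """Coarse test: is the point inside the polygon's bounding box?"""
    x, y = point
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    return min_x <= x <= max_x and min_y <= y <= max_y


def in_polygon(point, polygon):
    """Refined test: ray casting with a horizontal ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge crosses the ray's line only if its ends lie on either side.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Invented data: a triangular zone and two candidate locations.
triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
candidates = {"A": (1.0, 1.0), "B": (3.5, 3.5)}

by_bbox = [n for n, p in candidates.items() if in_bounding_box(p, triangle)]
by_geometry = [n for n, p in candidates.items() if in_polygon(p, triangle)]
```

Here `by_bbox` retains both candidates, whereas `by_geometry` retains only `A`: candidate `B` is the kind of false match that the bounding-box approximation produces.
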
The quality of the spatial PIV IRS depends largely on this work, both on the
improvement of ES typing with the help of ontological resources and on the
precision of the numeric representation of the RSFs.
Note that, as in the spatial case, creating and then using praxonymic temporal
resources (historical events, diseases and cultural events) relating to the
18th and 19th centuries and to the Pyrénées mountains could improve the
temporal PIV IRS.
All these orientations are at the core of the concerns and work of the T2I team
of the LIUPPA laboratory concerning spatial and temporal information in textual
documents. The propositions targeting the creation of a multicriteria IR system
combining the spatial, temporal and thematic dimensions are based on this first
foundation of results.
As we have just emphasized, the approaches presented in this section are better
suited than those supported by classic IRSs for the retrieval of documents based on