Access by Geographic Content to Textual Corpora: What Orientations? - Geographical Information Retrieval in Textual Corpora

1

Access by Geographic Content to Textual Corpora: What Orientations?

1.1. Introduction

The volume of digital corpora keeps growing, and retrieving relevant documents is an increasingly delicate task. The ambiguity of natural language terms contributes to this difficulty, both in the automatic interpretation of the expressed information need and in the automatic evaluation of how well documents match that need. The multiple meanings of terms and their numerous uses in varied contexts make information retrieval a delicate task indeed. Our working hypothesis therefore consists of distinguishing the spatial, temporal and thematic dimensions in order to implement dedicated approaches in the processes of indexing and information retrieval (IR). The objective is to contribute to a better content analysis of textual corpora as well as to a better grasp of the search criteria expressed in a query. Recall that we are studying digitized textual corpora with “territorial” denotations, to which character recognition has been applied but whose logical structure has not been preserved.
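To make the working hypothesis concrete, the following is a minimal sketch, not the approach developed in this book, of what keeping the spatial, temporal and thematic dimensions apart at indexing and retrieval time could look like. The toy gazetteer, the four-digit year rule and the function names `split_dimensions`, `build_indexes` and `search` are all assumptions introduced for illustration only.

```python
import re
from collections import defaultdict

# Hypothetical toy gazetteer; a real system would rely on a geographic resource.
GAZETTEER = {"pau", "pyrenees", "bordeaux"}
# Assumption: a four-digit token is treated as a year (temporal expression).
YEAR = re.compile(r"^\d{4}$")
DIMENSIONS = ("spatial", "temporal", "thematic")


def split_dimensions(text):
    """Assign each token to the spatial, temporal or thematic dimension."""
    dims = {name: [] for name in DIMENSIONS}
    for token in re.findall(r"\w+", text.lower()):
        if token in GAZETTEER:
            dims["spatial"].append(token)
        elif YEAR.match(token):
            dims["temporal"].append(token)
        else:
            dims["thematic"].append(token)
    return dims


def build_indexes(docs):
    """Build one inverted index (term -> set of document ids) per dimension."""
    indexes = {name: defaultdict(set) for name in DIMENSIONS}
    for doc_id, text in docs.items():
        for name, tokens in split_dimensions(text).items():
            for token in tokens:
                indexes[name][token].add(doc_id)
    return indexes


def search(indexes, spatial=None, temporal=None, thematic=None):
    """Intersect the postings of whichever criteria are supplied."""
    criteria = zip(DIMENSIONS, (spatial, temporal, thematic))
    results = None
    for name, term in criteria:
        if term is None:
            continue
        postings = set(indexes[name].get(term.lower(), set()))
        results = postings if results is None else results & postings
    return results or set()
```

Each criterion of a query is matched only against the index of its own dimension, so a place name is never confused with a thematic term. A multicriteria query is then the intersection of the per-dimension results.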

This chapter is organized as follows. Section 1.2 presents the general context of geographic information retrieval (GIR). Section 1.3 introduces the main fields of research and positions our study among them. Section 1.4 outlines our research approach to the construction of spatial, temporal and multicriteria search engines.

1.2. Access by geographic content to textual corpora

Our study of the processing of information in text is detailed mainly in the theses [BAZ 05, LES 07, PAL 10a, KER 11]. Following a number of reminders related to document retrieval and textual corpora, we will describe the characteristics
