of corpora with “territorial” denotations and their uses. This category of corpora will
constitute the field of experimentation for our propositions.
1.2.1. Document retrieval and textual corpora
Document retrieval or information retrieval [BAE 99, BOU 08] is traditionally
defined as a set of techniques allowing us to select, from a collection of documents,
information that is likely to meet the needs of the user.
A collection of documents (document repository or corpus) is the information
accessible via the document retrieval system (or information retrieval system, IRS). It
consists of documents, which are its unit elements. Textual documents are represented by a set of
descriptors (terms, for example) stored in files of descriptive instructions (metadata)
or indexes whose structure can be more complex [BES 04]. However, the notion of
document in itself is vague. Generally defined by its container (e.g. a topic, the
physical object that contains the text), it often varies and the expected result of a
query may not be an entire topic but one or more particularly relevant fragments.
This is indeed the reason why we use the expression “document unit” or “document
fragment” to define the unit of text returned to the user [BAZ 05].
Finally, a query corresponds to the expression of the information needs of the user.
It constitutes the input parameter to the retrieval system and is expressed in a query
language that is often simple: a choice of keywords and logical operators, for instance.
Nevertheless, other query languages are described in the literature: natural language, graphical
language, etc. [GOK 09].
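To make the notions of descriptor, index and keyword query more concrete, the following Python sketch builds a very small inverted index and evaluates a query made of keywords and logical operators. It is only an illustration under simplifying assumptions, not the system discussed in this topic: the three document units, the names tokenize, build_index and search, and the left-to-right evaluation without parentheses are all hypothetical choices made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase a text and split it into alphabetic terms (a naive descriptor extractor)."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(corpus):
    """Map each term to the set of identifiers of the document units that contain it."""
    index = defaultdict(set)
    for unit_id, text in corpus.items():
        for term in tokenize(text):
            index[term].add(unit_id)
    return index


def search(index, all_ids, query):
    """Evaluate keywords joined by AND / OR / NOT, strictly left to right.

    Assumption: no parentheses and no operator precedence; two adjacent
    keywords are treated as an implicit AND.
    """
    result, operator, negate = set(), "OR", False
    for token in query.split():
        if token in ("AND", "OR"):
            operator = token
        elif token == "NOT":
            negate = True
        else:
            postings = index.get(token.lower(), set())
            if negate:
                postings = all_ids - postings
                negate = False
            result = result & postings if operator == "AND" else result | postings
            operator = "AND"  # default operator for the next keyword
    return result


# Hypothetical document units, invented for this example.
corpus = {
    "u1": "The stream crosses the old city near the cathedral",
    "u2": "A travelogue describing the mountain stream and its valley",
    "u3": "The city council voted on the new stadium",
}
index = build_index(corpus)
all_ids = set(corpus)

for query in ("stream AND city", "stream OR stadium", "city NOT stream"):
    print(query, "->", sorted(search(index, all_ids, query)))
```

The index here plays the role of the descriptor store, and the identifiers returned by search stand for the document units delivered to the user. A real retrieval system would add ranking, stemming and a richer query language.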
1.2.2. Textual corpora with “territorial” denotations
A textual corpus with “territorial” denotations is composed of travelogues, stories,
newspapers, novels, poems, etc. These documents describe or discuss a territory. As
detailed in [KER 11], the territorial dimension is symbolized in textual documents
by a significant frequency of toponyms, outlined facts or described observations.
Toponyms denote, for example, streams, cities and buildings. The facts describe,
for example, political or sport-related events as well as various other events. The
observations refer to architecture, botany, geology, agriculture, etc. These categories
of information are, in a general way, linked to a location or a period of time.
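As a rough illustration of what a “significant frequency of toponyms” could mean in practice, the sketch below counts the tokens of a text that appear in a small list of place names and reports their proportion. Everything specific in it is an assumption made for the example: the gazetteer contents, the function names toponym_density and has_territorial_denotation, and the 5% threshold are hypothetical and are not taken from [KER 11].

```python
import re

# Hypothetical gazetteer: a real system would rely on a geographic resource
# and would also handle multi-word and ambiguous place names.
GAZETTEER = {"pau", "bayonne", "adour", "pyrenees"}


def toponym_density(text, gazetteer=GAZETTEER):
    """Return (toponym_count, token_count, density) for one text."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())  # runs of letters only
    hits = sum(1 for token in tokens if token in gazetteer)
    density = hits / len(tokens) if tokens else 0.0
    return hits, len(tokens), density


def has_territorial_denotation(text, threshold=0.05):
    """Flag a text whose toponym density reaches an arbitrary, illustrative threshold."""
    return toponym_density(text)[2] >= threshold


travelogue = "From Pau we followed the Adour towards Bayonne"
minutes = "The committee approved the annual budget"

print(toponym_density(travelogue))
print(has_territorial_denotation(travelogue), has_territorial_denotation(minutes))
```

Such a count only captures toponyms. The facts and observations mentioned above would require further analysis of the text and are not covered by this sketch.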
- Territory: The Longman Dictionary defines the term territory as “an area for
which one person or branch of an organization is responsible”. Kergosien [KER 11]
presents a consistent overview of the notion of territory. Among the different
definitions proposed, we will retain the following two [KER 11, p. 70]: “A globally
accepted definition in geography describes territory as a space on which an authority
is exercised and is limited by political and administrative borders. This definition is