1.2.3. Access to textual content
A study of IR tasks carried out by students revealed that the three main
categories of search criteria are bibliographical (people), chronological (periods)
and spatial (toponyms) [MAN 09]. Many other studies report a considerable
proportion of references to places in users' search criteria: for the
Excite [SAN 04], AOL [GAN 08] and Yahoo [JON 08] engines, this proportion
varies between 12.7% and 18.6%. Moreover, 79.5% of these queries contain
toponyms [SAN 04].
In the context of digital libraries (DLs), the interfaces for IR and for navigating the
resulting documents are by default organized along subject (themes) and chronological
dimensions (see the Google Books, Europeana and Gallica projects) or along subject,
chronological and spatial dimensions (see the Bibliothèque Numérique Mondiale project). Here, the
IR process relies on advanced document management tools. These document
management systems are based on metadata made up of descriptive records or
full-text indexes in which geographical information, toponyms among others, is
exploited in the same way as all the other terms.
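The point that a full-text index treats a toponym like any other term can be illustrated with a minimal sketch. It is not the implementation of any of the systems cited above: the function names `build_index` and `search`, the three-document corpus and the assumption that Laruns counts as being "around Pau" are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

PUNCTUATION = '.,;:"()'

def build_index(docs):
    """Map each lowercased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token.strip(PUNCTUATION)].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents containing every query term (boolean AND)."""
    terms = [t.strip(PUNCTUATION) for t in query.lower().split()]
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical mini-corpus; the texts are invented for this example.
docs = {
    "d1": "Thermal springs in Pau.",
    "d2": "Springs in the hills above Laruns.",  # assumed to be "around Pau"
    "d3": "Canyons south of Laruns.",
}
index = build_index(docs)

# "pau" is matched as a plain string: d2 is missed although it is assumed nearby.
print(search(index, "springs Pau"))
# The spatial relation "around" is just another keyword that no document contains.
print(search(index, "springs around Pau"))
```

In this sketch, the index has no notion of distance or direction, so a criterion such as "around Pau" can only succeed if those exact words occur in a document.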
Concerning the MIDR corpus, a number of categories of use could be studied.
A qualitative study of librarians' activities in event-preparation
scenarios allowed us to highlight IR approaches that prioritize, in order of
importance, the categories of bibliography (people), subject (themes), chronology
(periods) and place (toponyms). The usage scenarios of a tourist generally give priority to
the tourist's current location or the place they intend to visit, and only then focus
on the subject (themes), bibliography (people) or chronology (periods). We thus
distinguish three categories of users likely to carry out IR based on
geographic criteria, ordered by a priori decreasing background knowledge. The first category
includes scholars, for instance historians, who wish to find precise information
related to a place or a date. It also encompasses librarians, for example, whose
purpose is to improve document annotation or to prepare exhibitions.
The second category includes the inhabitants of a region who wish to know more
about it. It also covers teachers and their students, for example, who want to discover
the itinerary described in a travelogue. Finally, the third category includes tourists,
for example, who wish to identify the activities, monuments or other points of
interest accessible in a given zone (“the canyons to the south of Laruns”, “the springs
around Pau”, etc.). It also covers anyone wanting to find information using
spatial and temporal criteria.
We have highlighted the significant presence of geographic criteria in the IR
scenarios applied to Web content and DLs. Nevertheless, conventional search engines do
not take into account the particularities of spatial and temporal
information. Indeed, they are limited to the search terms (keywords) entered by the
