discovery of territory targeting, for example, pupils, tourists or scholars. We propose dedicated models of spatial and temporal IR.

Figure 1.4. Spatial and temporal process flows (extraction, validation and interpretation)

Second (Figure 1.5), in continuity with the first step, we have implemented a multidimensional, multicriteria IR model. We propose submitting each criterion of a query to the IRS dedicated to the corresponding dimension (spatial, temporal or thematic) and then combining the results. Before any combination, however, we have chosen to generalize the representation of the data in each dimension in order to avoid possible biases. This generalization first requires segmenting the space (respectively the period) covered by the document corpus to be indexed: we call this spatial (respectively temporal) tiling, or splitting. This results in generalized indexes (Figure 1.5). We then proceed to a projection in which every intersection between a tile of the generalized index and an object of the initial index increases the weight of that tile. We propose regular, administrative and calendar tilings with tiles of various sizes.
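The tiling and projection steps can be illustrated with a minimal sketch. It assumes a regular grid tiling and represents each object of the initial spatial index by a bounding box; the names `regular_tiles`, `intersects` and `project` are hypothetical and are not taken from the original system. Each intersection between an object and a tile adds one to the weight of that tile, which is only one possible reading of "increases the weight".

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Assumption: every spatial object is approximated by its bounding box
# (min_x, min_y, max_x, max_y) in the same coordinate system as the tiles.
BBox = Tuple[float, float, float, float]
TileId = Tuple[int, int]


def regular_tiles(extent: BBox, n_cols: int, n_rows: int) -> Dict[TileId, BBox]:
    """Split the extent covered by the corpus into a regular grid of tiles."""
    min_x, min_y, max_x, max_y = extent
    width = (max_x - min_x) / n_cols
    height = (max_y - min_y) / n_rows
    return {
        (col, row): (
            min_x + col * width,
            min_y + row * height,
            min_x + (col + 1) * width,
            min_y + (row + 1) * height,
        )
        for col in range(n_cols)
        for row in range(n_rows)
    }


def intersects(a: BBox, b: BBox) -> bool:
    """Return True when the two boxes overlap with a non-empty interior."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def project(objects: Iterable[BBox], tiles: Dict[TileId, BBox]) -> Counter:
    """Build a generalized index: each object/tile intersection adds 1 to the tile."""
    weights: Counter = Counter()
    for obj in objects:
        for tile_id, tile_box in tiles.items():
            if intersects(obj, tile_box):
                weights[tile_id] += 1
    return weights


if __name__ == "__main__":
    tiles = regular_tiles((0.0, 0.0, 4.0, 4.0), n_cols=2, n_rows=2)
    objects = [(0.5, 0.5, 1.5, 1.5), (1.0, 1.0, 3.0, 3.0)]
    print(project(objects, tiles))
```

An administrative or calendar tiling would replace `regular_tiles` with a lookup of predefined units (for example districts or months), while the projection step would stay the same. A temporal index can be handled in the same way by treating periods as one-dimensional intervals.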

This tiling approach, comparable to the generalization of terms by truncation or lemmatization in classic IR approaches, allows us to apply well-tried IR models to each of the geographic dimensions. We compare, for example, vector space IR to the ad hoc IR models developed for each dimension. The generalization of course induces losses in precision and recall. Nevertheless, integrating tile reference frequencies into the calculation of weights and relevance scores delivers gains that we have quantified (see section 3.5.2, Chapter 3).
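To make the analogy with term-based IR concrete, the following sketch treats tiles as if they were terms in a vector space model. The weighting formula is an assumption: the text does not give the formula actually used, so a TF-IDF-style weight (tile frequency multiplied by a smoothed inverse document frequency) and cosine similarity stand in for it here. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# Assumption: each document (and the query) is described by a mapping
# from tile identifier to the number of references falling in that tile.
TileCounts = Dict[str, int]


def tile_idf(doc_tile_counts: Dict[str, TileCounts]) -> Dict[str, float]:
    """Smoothed inverse document frequency of each tile (illustrative formula)."""
    n_docs = len(doc_tile_counts)
    doc_freq: Counter = Counter()
    for counts in doc_tile_counts.values():
        doc_freq.update(counts.keys())
    return {tile: math.log(n_docs / df) + 1.0 for tile, df in doc_freq.items()}


def weight_vector(counts: TileCounts, idf: Dict[str, float]) -> Dict[str, float]:
    """Weight each tile by its frequency times its IDF; unknown tiles get 0."""
    return {tile: tf * idf.get(tile, 0.0) for tile, tf in counts.items()}


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    """Cosine similarity between two sparse tile vectors."""
    dot = sum(w * v.get(tile, 0.0) for tile, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query: TileCounts, docs: Dict[str, TileCounts]) -> List[Tuple[str, float]]:
    """Rank documents by decreasing similarity to the query in tile space."""
    idf = tile_idf(docs)
    q_vec = weight_vector(query, idf)
    scores = {doc: cosine(q_vec, weight_vector(c, idf)) for doc, c in docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    corpus = {"d1": {"A": 3, "B": 1}, "d2": {"B": 2}, "d3": {"C": 1}}
    for doc_id, score in rank({"A": 1}, corpus):
        print(doc_id, round(score, 3))
```

Because the generalized indexes of the spatial and temporal dimensions have the same shape as a term index, the same functions could be applied to each dimension, with the per-dimension scores combined afterwards.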
