discovery of territory targeting, for example, pupils, tourists or scholars. We propose
dedicated models of spatial and temporal IR.
Figure 1.4. Spatial and temporal process flows
(extraction, validation and interpretation)
Second (Figure 1.5), building on the first step, we have implemented a
multidimensional, multicriteria IR model. We propose submitting each criterion
of a query to the IRS dedicated to the corresponding dimension (spatial, temporal
or thematic) and then combining the results. Before any combination, however,
and in order to avoid possible biases, we have chosen to generalize the
representation of the data corresponding to each dimension. This generalization
first requires segmenting the space (respectively the period) covered by the
document corpus to be indexed: we call this spatial (respectively temporal)
tiling, or splitting. This results in generalized indexes (Figure 1.5). We then
carry out a projection in which every intersection between a tile of the
generalized index and an object of the initial index increases the weight of
that tile. We propose regular, administrative and calendar tilings with tiles of
various sizes.
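The projection step can be illustrated with a minimal sketch for the simplest case, a regular spatial tiling. Everything here is an assumption made for illustration and not the authors' implementation: the objects of the initial index are reduced to axis-aligned bounding boxes, the tiling is a square grid, the function name project_to_tiles is hypothetical, and each intersection adds exactly 1 to the weight of the tile.

```python
from collections import Counter
from math import floor


def project_to_tiles(boxes, origin, tile_size):
    """Project bounding boxes onto a regular square grid.

    boxes     : iterable of (min_x, min_y, max_x, max_y) tuples (assumed
                representation of the objects of the initial index)
    origin    : (x, y) of the lower-left corner of the grid
    tile_size : side length of a tile, in the same units as the boxes

    Returns a Counter mapping (column, row) tile identifiers to a weight
    equal to the number of boxes intersecting that tile. A box that only
    touches a tile boundary is counted as intersecting the tile.
    """
    origin_x, origin_y = origin
    weights = Counter()
    for min_x, min_y, max_x, max_y in boxes:
        # Range of grid columns and rows covered by the box.
        col_lo = floor((min_x - origin_x) / tile_size)
        col_hi = floor((max_x - origin_x) / tile_size)
        row_lo = floor((min_y - origin_y) / tile_size)
        row_hi = floor((max_y - origin_y) / tile_size)
        for col in range(col_lo, col_hi + 1):
            for row in range(row_lo, row_hi + 1):
                # Each intersection increases the weight of the tile.
                weights[(col, row)] += 1
    return weights


if __name__ == "__main__":
    # Two hypothetical spatial objects from one document.
    document_boxes = [(0.5, 0.5, 1.5, 0.8), (1.2, 0.2, 1.4, 0.4)]
    print(project_to_tiles(document_boxes, origin=(0.0, 0.0), tile_size=1.0))
```

An administrative or calendar tiling would replace the grid arithmetic with a lookup of the administrative units or calendar periods that each object intersects; the counting logic would stay the same.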
This tiling approach, comparable to the generalization of terms by truncation or
lemmatization in classic IR approaches, allows us to apply well-established IR
models to each of the geographic dimensions. We compare, for example, vector
space IR with the ad hoc IR models developed for each dimension. The
generalization of course induces losses in precision and recall. Nevertheless,
integrating tile reference frequencies into the calculation of weights and
relevance scores delivers gains that we have quantified (see section 3.5.2,
Chapter 3).
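One way to picture how tile frequencies can feed a vector-style model is to treat each tile as a term. The sketch below is an assumption, not the weighting formula of the source: it uses a standard TF-IDF weight (tile frequency multiplied by a smoothed inverse document frequency) and cosine similarity, and the function names and the sample data are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(doc_tiles):
    """Build TF-IDF vectors from per-document tile frequencies.

    doc_tiles maps a document id to a dict {tile_id: frequency}.
    Returns (vectors, idf), where idf uses the smoothed form
    log(N / df) + 1 (an assumed, conventional choice).
    """
    n_docs = len(doc_tiles)
    doc_freq = Counter()
    for counts in doc_tiles.values():
        doc_freq.update(counts.keys())  # one count per document per tile
    idf = {tile: math.log(n_docs / df) + 1.0 for tile, df in doc_freq.items()}
    vectors = {
        doc: {tile: freq * idf[tile] for tile, freq in counts.items()}
        for doc, counts in doc_tiles.items()
    }
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(tile, 0.0) for tile, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank(query_tiles, doc_tiles):
    """Rank documents by cosine similarity to the tiled query."""
    vectors, idf = tfidf_vectors(doc_tiles)
    # Tiles unknown to the corpus get a zero weight in the query vector.
    query = {tile: freq * idf.get(tile, 0.0) for tile, freq in query_tiles.items()}
    scores = {doc: cosine(query, vec) for doc, vec in vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical generalized spatial index: tile frequencies per document.
    corpus = {
        "d1": {(0, 0): 3, (1, 0): 1},
        "d2": {(1, 0): 2},
        "d3": {(5, 5): 1},
    }
    for doc, score in rank({(0, 0): 1}, corpus):
        print(doc, round(score, 3))
```

The same functions apply unchanged to a temporal index if the tile identifiers are calendar periods instead of grid cells, which is what makes a shared generalized representation convenient before results from the different dimensions are combined.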