Spatial and Temporal Information Retrieval in Textual Corpora - Geographical Information Retrieval in Textual Corpora

Equation [2.7] is translated into an SQL query containing GIS functions in order to

perform the search and to score the selected document fragments. Finally, an ordered

list of document fragments constitutes the result.

If we take the example of a query containing the phrase “to the south of Pau”,

an RSF is recognized and a georeferenced footprint is computed. This representation

is compared to those contained in the index and, for example, a document fragment

containing “city of Gan” [footnote 27] is retrieved. The computed overlapping area corresponds

then to an instance of the core model: the intersection of the south of Pau and of Gan.

Just as this phrase, when interpreted by a human, yields a qualitative relevance

score, its georeferenced representation yields a quantitative relevance score.

We propose a calculation of spatial relevance based on bounding-box

representations, which can be extended to other geometric primitives.
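
The bounding-box comparison described above can be illustrated with a minimal Python sketch that computes the overlapping area of two footprints. The `BBox` class, the `intersection` helper, and all coordinates are hypothetical and chosen for illustration only: they are not real coordinates for Pau or Gan, and the sketch does not reproduce the scoring of equation [2.7] or the SQL/GIS query derived from it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BBox:
    """Axis-aligned bounding box (hypothetical helper, planar units)."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float

    def area(self) -> float:
        return (self.max_x - self.min_x) * (self.max_y - self.min_y)


def intersection(a: BBox, b: BBox) -> Optional[BBox]:
    """Return the overlapping box of a and b, or None if they do not overlap."""
    min_x = max(a.min_x, b.min_x)
    min_y = max(a.min_y, b.min_y)
    max_x = min(a.max_x, b.max_x)
    max_y = min(a.max_y, b.max_y)
    if min_x >= max_x or min_y >= max_y:
        return None  # disjoint boxes, or boxes that only touch on an edge
    return BBox(min_x, min_y, max_x, max_y)


# Made-up footprints in arbitrary planar units (assumption, not real geodata).
south_of_pau = BBox(0.0, 0.0, 20.0, 15.0)  # footprint of the query "to the south of Pau"
gan = BBox(8.0, 3.0, 12.0, 7.0)            # footprint of the indexed "city of Gan"

overlap = intersection(south_of_pau, gan)
overlap_area = overlap.area() if overlap is not None else 0.0
```

With these made-up values the footprint of Gan lies entirely inside the query footprint, so the overlap equals the Gan box itself. A quantitative score would then be derived from such an area; how exactly is defined by equation [2.7], which this sketch leaves out.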

2.4.5.2. Temporal IR

Proposals in the literature for calculating similarity scores in temporal IR

are recent (see section 2.3.6). Since 2007 (see [LEP 07]), we have proposed an

approach similar to the one developed for spatial IR.

Indeed, we have developed a model derived from the spatial IR model discussed

above. Therefore the process of temporal IR also involves the following steps:

1) Interpretation of the query. The query is interpreted using the same process flow

as for indexing. TFs are detected and then symbolic and numeric interpretations are

calculated.

2) Calculation of the set of results. Let $Set_{req}$ be the set of TFs annotated in

the query and $Set_{doc}$ the set of TFs annotated in the document. We have

$Set_{req} = \{TF_{req}\}$ and $Set_{doc} = \{TF_{doc}\}$. We then calculate $Set_{res}$, the set of

TFs of $Set_{doc}$ whose temporal representation has a non-empty intersection with

that of at least one TF of $Set_{req}$:

$Set_{res} = \{TF_{doc} \in Set_{doc} \mid \exists\, TF_{req} \in Set_{req},\ representation(TF_{doc}) \cap representation(TF_{req}) \neq \emptyset\}$.

The result of the query contains the set of document fragments to which the TFs of

$Set_{res}$ belong.

3) Calculation of the relevance score of each document fragment in the set of

results. We use the characteristics detailed in Figure 2.9 to measure the similarity

between a document fragment $D_f$ and a query $Q$:

\[
\mathrm{Similarity}(D_f, Q) = \frac{\mathrm{Precision}(D_f, Q) + \mathrm{Overlapping}(D_f, Q)}{2 + \mathrm{Distance}(D_f, Q)} \tag{2.9}
\]
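
Steps 2 and 3 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `TF` class, the modelling of a temporal representation as a closed interval of dates, and the example TFs are all hypothetical. The definitions of Precision, Overlapping and Distance come from Figure 2.9, which is not reproduced here, so `similarity` simply takes the three values as inputs and combines them as in equation [2.9].

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class TF:
    """Temporal feature with a closed-interval representation (assumption)."""
    label: str
    start: date
    end: date  # inclusive


def intersects(a: TF, b: TF) -> bool:
    """True if the two closed intervals share at least one day."""
    return a.start <= b.end and b.start <= a.end


def result_set(set_doc: List[TF], set_req: List[TF]) -> List[TF]:
    """Step 2: keep the document TFs that intersect at least one query TF."""
    return [
        tf_doc
        for tf_doc in set_doc
        if any(intersects(tf_doc, tf_req) for tf_req in set_req)
    ]


def similarity(precision: float, overlapping: float, distance: float) -> float:
    """Step 3: combine the three measures as in equation [2.9]."""
    if distance < 0:
        # Assumption: a distance is non-negative, so the denominator is >= 2.
        raise ValueError("distance must be non-negative")
    return (precision + overlapping) / (2.0 + distance)


# Hypothetical query and document TFs, for illustration only.
set_req = [TF("summer 1950", date(1950, 6, 21), date(1950, 9, 22))]
set_doc = [
    TF("August 1950", date(1950, 8, 1), date(1950, 8, 31)),
    TF("1962", date(1962, 1, 1), date(1962, 12, 31)),
]

set_res = result_set(set_doc, set_req)
best_case = similarity(precision=1.0, overlapping=1.0, distance=0.0)
```

Here only the "August 1950" TF is retained in `set_res`. If Precision and Overlapping are bounded by 1 and Distance is non-negative (an assumption about Figure 2.9), the score stays within [0, 1], reaching 1 only when both measures are maximal and the distance is zero.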

[Footnote 27] Gan is located 10 km to the south of Pau.
