the document i). The score of a retrieved document is, for example, determined by
calculating the scalar product [BAE 99, GOK 09].
Tile frequency (TF):
$W(t, Du) = TF(t, Du) = \frac{freq(t, Du)}{\sum_{i=1}^{n} freq(t_i)}$

TF·IDF:
$W(t, Du) = TF(t, Du) \cdot IDF(t)$, with $IDF(t) = \log \frac{NDu}{NDu_t}$

OkapiBM25:
$W(t, Du) = \frac{(k_1 + 1) \cdot TF(t, Du)}{K + TF(t, Du)}$, with $K = k_1 \cdot \left[ (1 - b) + b \cdot \frac{n}{advl} \right]$

TFp:
$W(t, Du) = TFp(t, Du) = \frac{freqP(t, Du)}{\sum_{i=1}^{n} freq(t_i)}$

where:
freq(t, Du): frequency of the tile t in the document unit Du
freqP(t, Du): continuous frequency of the tile t in the document unit Du
n: number of tiles in the document unit Du
$\sum_{i=1}^{n} freq(t_i)$: cumulated number of occurrences of tiles in the document unit Du
$NDu_t$: number of document units related to the tile t
NDu: number of document units
$k_1 = 1.2$, $b = 0.75$, $advl = 900$
Table 3.3. Weighting formulas applied to the standardized indexes, for a tile t
and a document unit Du - taken from [PAL 10d]
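The four weighting schemes of Table 3.3 can be sketched in a few lines of Python. This is only an illustrative transcription of the formulas, not the implementation used in [PAL 10d]: the function and parameter names are hypothetical, the logarithm is assumed to be the natural logarithm (the table only says "log"), and the caller is assumed to supply the raw counts.

```python
import math

# Constants given in Table 3.3.
K1 = 1.2
B = 0.75
ADVL = 900


def tf(freq_t, total_freq):
    """TF(t, Du): occurrences of tile t over all tile occurrences in Du."""
    return freq_t / total_freq


def idf(n_du, n_du_t):
    """IDF(t) = log(NDu / NDu_t); natural logarithm is an assumption."""
    return math.log(n_du / n_du_t)


def tf_idf(freq_t, total_freq, n_du, n_du_t):
    """TF.IDF weight: TF(t, Du) multiplied by IDF(t)."""
    return tf(freq_t, total_freq) * idf(n_du, n_du_t)


def okapi_bm25(freq_t, total_freq, n, k1=K1, b=B, advl=ADVL):
    """OkapiBM25 weight, with K = k1 * [(1 - b) + b * n / advl]."""
    k = k1 * ((1 - b) + b * n / advl)
    tf_value = tf(freq_t, total_freq)
    return (k1 + 1) * tf_value / (k + tf_value)


def tfp(freq_p, total_freq):
    """TFp(t, Du): same ratio as TF, using the continuous frequency freqP."""
    return freq_p / total_freq
```

For instance, a tile occurring 3 times in a document unit holding 12 tile occurrences gets `tf(3, 12) == 0.25`; if that tile is related to 10 of 100 document units, `tf_idf(3, 12, 100, 10)` multiplies this value by `log(10)`.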
        T_1     T_2     ...     T_t
D_1     w_11    w_21    ...     w_t1
D_2     w_12    w_22    ...     w_t2
...     ...     ...     ...     ...
D_n     w_1n    w_2n    ...     w_tn
Table 3.4. Vectorial model: document-tile matrix
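As a minimal sketch of how the document-tile matrix of Table 3.4 supports scoring by scalar product, the Python example below ranks document units against a query. It assumes the query is itself expressed as a vector of tile weights, as is usual in the vector model; the weights, the `score` and `rank` helpers, and the sample data are all hypothetical and chosen only for illustration.

```python
def score(query_weights, doc_weights):
    """Scalar product of a query vector and one row of the document-tile matrix."""
    return sum(q * w for q, w in zip(query_weights, doc_weights))


def rank(query_weights, matrix):
    """Return (row index, score) pairs sorted by decreasing score."""
    scores = [(index, score(query_weights, row)) for index, row in enumerate(matrix)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights: rows are document units D1..D3, columns are tiles T1..T4.
doc_tile_matrix = [
    [0.5, 0.0, 0.25, 0.0],
    [0.0, 0.5, 0.0, 0.25],
    [0.25, 0.25, 0.125, 0.0],
]

# Hypothetical query covering tiles T1 and T3 only.
query = [1.0, 0.0, 1.0, 0.0]

ranking = rank(query, doc_tile_matrix)
```

With these sample values the first document unit scores 0.75, the third 0.375, and the second 0.0, so document units sharing no tile with the query fall to the bottom of the ranking.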
Given that the information can be represented at different levels of generalization,
the proposed multi-level tiling allows us to use the tile index best suited to the
range of the user's query.
Several tests described in [PAL 10a] will be discussed in section 3.5. These will
mainly allow us to verify that the loss of precision due to tiling does not degrade the
 