Geoscience Reference
In-Depth Information
the document i). The score of a retrieved document is, for example, determined by
calculating the scalar product [BAE 99, GOK 09].
Tile frequency (TF):
    $W(t,Du) = TF(t,Du) = \frac{freq(t,Du)}{\sum_{i=1}^{n} freq(t_i)}$

TF·IDF:
    $W(t,Du) = TF(t,Du) \cdot IDF(t)$, with $IDF(t) = \log \frac{NDu}{NDu_t}$

OkapiBM25:
    $W(t,Du) = \frac{(k_1 + 1) \cdot TF(t,Du)}{K + TF(t,Du)}$, with $K = k_1 \cdot \left[(1 - b) + \frac{b \cdot n}{advl}\right]$

TFp:
    $W(t,Du) = TFp(t,Du) = \frac{freqP(t,Du)}{\sum_{i=1}^{n} freq(t_i)}$

freq(t,Du): frequency of the tile t in the document unit Du
freqP(t,Du): continuous frequency of the tile t in the document unit Du
n: number of tiles in the document unit Du
$\sum_{i=1}^{n} freq(t_i)$: cumulated number of occurrences of tiles in the document unit Du
$NDu_t$: number of document units related to the tile t
NDu: number of document units
$k_1 = 1.2$, $b = 0.75$, $advl = 900$
Table 3.3. Weighting formulas applied to the standardized indexes, for a tile t
and a document unit Du - taken from [PAL 10d]
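As an illustration, the weighting formulas of Table 3.3 can be sketched in Python as follows. This is only a minimal sketch, not the implementation used in [PAL 10d]: the function and parameter names are hypothetical, the natural logarithm is assumed for IDF (the table does not state the base), and TFp is assumed to reuse the TF function with the continuous frequency freqP(t,Du) as numerator.

```python
import math

# Constants as given in Table 3.3
K1 = 1.2
B = 0.75
ADVL = 900


def tf(freq_t, total_freq):
    """TF(t,Du): frequency of tile t divided by the cumulated tile occurrences in Du.

    Passing the continuous frequency freqP(t,Du) as freq_t gives TFp(t,Du).
    """
    return freq_t / total_freq if total_freq else 0.0


def idf(n_du, n_du_t):
    """IDF(t) = log(NDu / NDu_t); the natural logarithm is an assumption."""
    return math.log(n_du / n_du_t)


def tf_idf(freq_t, total_freq, n_du, n_du_t):
    """W(t,Du) = TF(t,Du) * IDF(t)."""
    return tf(freq_t, total_freq) * idf(n_du, n_du_t)


def okapi_bm25(tf_value, n_tiles, k1=K1, b=B, advl=ADVL):
    """W(t,Du) = (k1 + 1) * TF / (K + TF), with K = k1 * [(1 - b) + b * n / advl]."""
    k = k1 * ((1 - b) + b * n_tiles / advl)
    return (k1 + 1) * tf_value / (k + tf_value)
```

For example, a tile occurring 3 times in a document unit that holds 12 tile occurrences in total has TF = 0.25. With the constants above, the BM25 normalization term K equals k1 exactly when n = advl.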
        T1     T2     ...    Tt
D1      w11    w21    ...    wt1
D2      w12    w22    ...    wt2
.       .      .      .      .
Dn      w1n    w2n    ...    wtn

wij: weight of the tile Ti in the document Dj
Table 3.4. Vectorial model: document-tile matrix
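The scalar-product scoring mentioned above can be sketched on such a document-tile matrix as follows. This is a hedged illustration only: the function names, the dictionary layout and the weight values are hypothetical and are not taken from the source.

```python
def score(query_vec, doc_vec):
    """Scalar product of a query vector and a document's tile-weight vector."""
    return sum(q * w for q, w in zip(query_vec, doc_vec))


def rank(query_vec, matrix):
    """Return (document, score) pairs sorted by decreasing score."""
    scores = {doc: score(query_vec, weights) for doc, weights in matrix.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Illustrative weights only: one row per document, one column per tile (T1, T2, T3)
matrix = {
    "D1": [0.5, 0.0, 0.2],
    "D2": [0.1, 0.4, 0.0],
    "D3": [0.0, 0.3, 0.6],
}
query = [1.0, 0.0, 1.0]  # hypothetical query involving tiles T1 and T3

ranking = rank(query, matrix)
```

Each row of the matrix is one document vector, so ranking reduces to one scalar product per document followed by a sort.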
Given that the information can be represented via different levels of generalization,
the proposed multi-level tiling allows us to use the index of tiles most adapted to the
range of the user's query.
Several tests described in [PAL 10a] will be discussed in section 3.5. These will
mainly allow us to verify that the loss of precision due to tiling does not degrade the