different collections and re-rank them as a new single list. In this section we compare
different state-of-the-art unsupervised merging algorithms in experiments on the
FedWeb 2012 dataset [ 26 ]. We first introduce in Sect. 4.4.1 the four different merging
algorithms used in our experiments. Then, in Sect. 4.4.2, we present the results of these
experiments by calculating information retrieval metrics (precision, recall, normalized
discounted cumulative gain) resulting from these approaches under different retrieval
settings.
4.4.1 Algorithms
Over the years, various algorithms have been introduced that merge result lists from
different indices. In the remainder of this section, we introduce the most common
unsupervised merging algorithms, namely CORI, weighted MinMax, and round robin.
We also introduce our own result merging algorithm, the naive merger, and compare
it with these algorithms.
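Of the algorithms above, round robin is the simplest: it ignores scores entirely and interleaves the ranked lists, taking one document from each collection in turn. A minimal sketch (document identifiers are illustrative):

```python
from itertools import zip_longest

def round_robin(*ranked_lists):
    """Merge ranked lists by taking one document from each list in turn,
    skipping lists that have run out of documents."""
    merged = []
    for group in zip_longest(*ranked_lists):
        merged.extend(doc for doc in group if doc is not None)
    return merged

print(round_robin(["a1", "a2", "a3"], ["b1", "b2"]))
# -> ['a1', 'b1', 'a2', 'b2', 'a3']
```

Because round robin never looks at retrieval scores, it needs no score normalization, but it also cannot prefer documents from more relevant collections.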
4.4.1.1 CORI
CORI was introduced by Callan et al. [ 6 ], who suggested calculating the relevance
of a collection as a weight and using this as a parameter to recalculate each
document score. It is important to note that CORI can also be used to rank collections,
which we do not do in our study. Let R denote the collection from which a
document is retrieved, d a retrieved document, and q an incoming query; the
CORI re-ranking is then defined as:
s_norm(d | q) = ((1 + 0.4 · s_MinMax(R | q)) / 1.4) · s_MinMax(d | q)    (4.1)
The value 0.4 was proposed by the authors as the default value determining how much
influence the collection weight should have. CORI is considered a state-of-the-art
algorithm, since experiments indicate that it is a robust unsupervised linear score
normalization method.
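The steps above can be sketched end to end: min-max normalize the document scores within each result list and the collection scores across collections, then apply Eq. (4.1) and sort. The collections and raw scores below are illustrative, not from the FedWeb 2012 experiments.

```python
def minmax(scores):
    """Min-max normalize a list of raw scores to [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def cori_score(doc_norm, coll_norm):
    """CORI re-ranked score, Eq. (4.1): (1 + 0.4 * s(R|q)) / 1.4 * s(d|q)."""
    return (1 + 0.4 * coll_norm) / 1.4 * doc_norm

# Illustrative input: collection-selection scores and per-collection
# result lists of (doc_id, raw retrieval score).
coll_raw = {"A": 3.0, "B": 1.0}
results = {
    "A": [("a1", 9.0), ("a2", 5.0), ("a3", 1.0)],
    "B": [("b1", 8.0), ("b2", 4.0)],
}

coll_norm = dict(zip(coll_raw, minmax(list(coll_raw.values()))))

merged = []
for coll, docs in results.items():
    doc_norm = minmax([s for _, s in docs])
    for (doc_id, _), s in zip(docs, doc_norm):
        merged.append((doc_id, cori_score(s, coll_norm[coll])))

merged.sort(key=lambda t: t[1], reverse=True)
print(merged)
```

Note how the collection weight pulls documents from the stronger collection A above equally ranked documents from B: B's best document is demoted below A's best, even though both had the top normalized score in their own list.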
4.4.1.2 Weighted MinMax
Markov et al. [ 22 ] proposed a modification of the CORI algorithm, referred to as
weighted MinMax. In this paper the authors replace the constant 0.4, which repre-
sents the importance of a collection, with a variable λ. The authors investigated
how the result merging performance of CORI is influenced by varying this λ
parameter. The authors concluded that by setting λ to infinity
(λ → ∞) they can outperform other unsupervised linear score normalization
methods.
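A rough sketch of this parameterization, under the assumption that CORI's constants 0.4 and 1.4 = 1 + 0.4 generalize to λ and 1 + λ (the exact form used by Markov et al. may differ). Under this assumption, letting λ → ∞ reduces the score to the plain product of the normalized collection and document scores:

```python
def weighted_minmax(doc_norm, coll_norm, lam):
    """Assumed generalization of Eq. (4.1): (1 + lam * s(R|q)) / (1 + lam) * s(d|q).
    With lam = 0.4 this is exactly CORI."""
    return (1 + lam * coll_norm) / (1 + lam) * doc_norm

def weighted_minmax_inf(doc_norm, coll_norm):
    """Limit lam -> infinity: the collection weight dominates, and the
    score becomes s(R|q) * s(d|q)."""
    return coll_norm * doc_norm

# Increasing lambda converges to the infinite-lambda limit.
for lam in (0.4, 10, 1e6):
    print(lam, weighted_minmax(0.5, 0.8, lam))
print("inf", weighted_minmax_inf(0.5, 0.8))
```

With λ = 0.4 this recovers CORI exactly; larger λ values weight the collection score more heavily, which matches the intuition behind the authors' λ → ∞ result.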