Using the Euclidean Distance for Retrieval Evaluation - Advances in Databases


$$ s(i) = a_0 + a_1 \ln i + a_2 (\ln i)^2 + a_3 (\ln i)^3 \tag{2} $$

In Equation 2, $s(i)$ is the relevance score of the document at rank $i$, and $a_0$, $a_1$, $a_2$, and $a_3$ are four parameters.
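
As an illustration of how Equation 2 can be evaluated, the following minimal Python sketch computes $s(i)$ for a given rank. The function name `relevance_score` and the coefficient values are hypothetical and chosen only for demonstration; they are not the fitted parameters used in this study.

```python
import math


def relevance_score(rank, coeffs):
    """Evaluate s(i) = a0 + a1*ln(i) + a2*ln(i)^2 + a3*ln(i)^3 at the given rank."""
    if rank < 1:
        raise ValueError("rank must be at least 1")
    a0, a1, a2, a3 = coeffs
    x = math.log(rank)  # natural logarithm of the rank
    return a0 + a1 * x + a2 * x ** 2 + a3 * x ** 3


# Hypothetical coefficients, for illustration only (not taken from the paper).
EXAMPLE_COEFFS = (1.0, -0.2, 0.0, 0.0)

if __name__ == "__main__":
    for i in (1, 10, 100):
        print(i, relevance_score(i, EXAMPLE_COEFFS))
```

Since $\ln 1 = 0$, the score at rank 1 is simply $a_0$, whatever the other three parameters are.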

Therefore, after appropriate pre-processing, any result can be evaluated using the Euclidean distance. Some variations of the Euclidean distance can also be defined; see [10] for a detailed discussion.

2 Investigation Objectives and Methodologies

The aims of the study are twofold: one is to evaluate the Euclidean distance, which is introduced in this paper; the other is to evaluate the ranking-based metrics in an environment where 3-graded relevance judgment is used. There have been quite a few empirical investigations of those metrics when binary relevance judgment is used (e.g., in [2,7,8,13]). However, up to now very little has been done for them when relevance judgment methods other than binary relevance judgment are used.

Apart from the Euclidean distance, we also consider four ranking-based metrics: AP, RP, NDCG, and P10, because they are four of the most commonly used metrics for retrieval evaluation. Comparing these two types of metrics helps us to better understand their characteristics.

For readers' convenience, we discuss how these metrics are defined. First, let us see how they are defined when binary relevance judgment is used. Suppose that, for a query $Q$, an information retrieval system returns a list of documents $R$. There are $\mathit{total\_r}$ relevant documents in the whole collection. AP is defined as

$$ AP = \frac{1}{\mathit{total\_r}} \sum_{i=1}^{\mathit{total\_r}} \frac{i}{p_i} $$

Here $p_i$ is the ranking position of the $i$-th relevant document in the resultant list $R$. One thing to note is that usually only a very small percentage of the documents in the whole collection are retrieved and included in any result, so it is very likely that fewer than $\mathit{total\_r}$ relevant documents will appear in such a result. We then simply assume that the missing relevant documents will never appear, and their contribution to the value of AP is ignored. For example, if $t$ relevant documents appear in $R$, then AP can be defined as

$$ AP = \frac{1}{\mathit{total\_r}} \sum_{i=1}^{t} \frac{i}{p_i} $$

For example, if there are 4 relevant documents in the whole collection and 2 of them are retrieved at ranking positions 2 and 10 in $R$, then AP = 1/4*(1/2+2/10) = 0.175.
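
The truncated definition of AP can be sketched in a few lines of Python. This is only an illustrative sketch under the binary relevance setting described above; the function name `average_precision` and its arguments are assumptions made here, not part of the paper. With relevant documents at ranks 2 and 10 and four relevant documents in the collection, it reproduces the arithmetic 1/4*(1/2+2/10) = 0.175.

```python
def average_precision(relevant_ranks, total_r):
    """Compute AP = (1 / total_r) * sum_{i=1..t} i / p_i.

    relevant_ranks: 1-based ranking positions p_i of the t relevant
                    documents that actually appear in the result R.
    total_r:        number of relevant documents in the whole collection.
    Relevant documents missing from R contribute nothing to the sum.
    """
    if total_r <= 0:
        raise ValueError("total_r must be positive")
    ranks = sorted(relevant_ranks)
    if len(ranks) > total_r:
        raise ValueError("more relevant documents retrieved than exist")
    if ranks and ranks[0] < 1:
        raise ValueError("ranking positions are 1-based")
    # The i-th relevant document (i = 1, 2, ...) at position p_i adds i / p_i.
    return sum(i / p for i, p in enumerate(ranks, start=1)) / total_r


if __name__ == "__main__":
    print(average_precision([2, 10], total_r=4))
```

Sorting the positions first means the caller does not have to supply them in rank order.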

RP is defined as the percentage of relevant documents in the top $\mathit{total\_r}$ documents in $R$.
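
A minimal Python sketch of RP under binary relevance judgment follows. The function name `r_precision` and the list-of-booleans input are assumptions made for illustration. The sketch also assumes that, when $R$ holds fewer than $\mathit{total\_r}$ documents, the missing positions count as non-relevant.

```python
def r_precision(relevance_flags, total_r):
    """Fraction of relevant documents among the top total_r documents of R.

    relevance_flags: booleans in rank order, True if the document at that
                     position is relevant.
    total_r:         number of relevant documents in the whole collection.
    """
    if total_r <= 0:
        raise ValueError("total_r must be positive")
    top = relevance_flags[:total_r]  # only the top total_r positions count
    # Dividing by total_r treats positions beyond the end of a short list
    # as non-relevant (an assumption of this sketch).
    return sum(1 for is_relevant in top if is_relevant) / total_r


if __name__ == "__main__":
    print(r_precision([False, True, False, True, True], total_r=4))
```

In the usage line, the fifth document is relevant but lies outside the top four positions, so it does not count towards RP.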
