gain obtained by a user by going through a ranked list, from the top, up to a
given position. It allows for graded relevance, and discounts the gain received
at lower ranks to favor systems that place highly relevant documents near the
top of the ranked list. The DCG score at rank n is calculated as follows:
$$\mathrm{DCG}(n) = \sum_{i=1}^{n} \frac{G(d_i, q)}{\log_b(i + b - 1)} \tag{9.4}$$
where $d_i$ is the i-th document in the ranked list, $G(d_i, q)$ is the graded relevance
of document $d_i$ with respect to the query $q$, and parameter $b$ is a pre-specified
constant to control the discount rates with respect to the position of each
document in the ranked list. The DCG score is normalized with respect to
the ideal (best possible) DCG to get the Normalized Discounted Cumulated
Gain (NDCG). To obtain a single score for the system's performance on a
query, the NDCG scores at all ranks are averaged. Given a test set of queries,
the per-query NDCG scores are further averaged to produce a global score.
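
As a rough illustration of the computation just described, the following Python sketch evaluates Equation 9.4, normalizes by an ideal ranking, averages over ranks, and then averages over queries. The function names (dcg_at, ndcg_at, mean_ndcg, global_score), the judged_gains parameter, and the default b = 2 are hypothetical choices made for this example, not part of the text. The sketch also assumes that the ideal ranking is obtained by sorting the available gains in descending order, and that a query with no gain at all scores 0.

```python
import math


def dcg_at(gains, n, b=2):
    """DCG(n) per Eq. 9.4: sum of G(d_i, q) / log_b(i + b - 1) for i = 1..n.

    `gains` holds the graded relevance of each ranked document, top first.
    `b` must be greater than 1 for the logarithm base to be valid.
    """
    return sum(g / math.log(i + b - 1, b)
               for i, g in enumerate(gains[:n], start=1))


def ndcg_at(gains, n, judged_gains=None, b=2):
    """DCG(n) divided by the DCG(n) of an ideal (descending-gain) ranking."""
    # Assumption: if no separate pool of judged gains is given, the ideal
    # ranking is simply a reordering of the system's own list.
    pool = gains if judged_gains is None else judged_gains
    ideal = dcg_at(sorted(pool, reverse=True), n, b)
    # Assumption: score 0.0 when the ideal DCG is zero at this rank.
    return dcg_at(gains, n, b) / ideal if ideal > 0 else 0.0


def mean_ndcg(gains, judged_gains=None, b=2):
    """Per-query score: NDCG averaged over all ranks of the list."""
    if not gains:
        return 0.0  # assumption: an empty ranked list scores 0
    ranks = range(1, len(gains) + 1)
    return sum(ndcg_at(gains, n, judged_gains, b) for n in ranks) / len(gains)


def global_score(per_query_gains, b=2):
    """Average of the per-query scores over a test set of queries."""
    scores = [mean_ndcg(g, b=b) for g in per_query_gains]
    return sum(scores) / len(scores) if scores else 0.0
```

For any valid b, the discount at rank 1 is log_b(b) = 1, so the top-ranked document contributes its full gain; larger values of b make the discount at lower ranks milder.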
In our evaluation scheme, we make two changes to the standard NDCG
metric, which we will describe in detail:
1. Replace graded document relevance $G(d_i, q)$ with graded passage utility
$U(p_i, q)$, which takes both nugget-based relevance and novelty into
account.
2. Penalize longer ranked lists to account for the effort spent by the user
in going through the system output.
9.4.2.1 Graded passage utility
To account for the presence of nuggets as well as whether the nuggets have
been seen by the user in the past, we calculate the gain received from each
passage in terms of utility $U(p_i, q)$, instead of relevance $G(d_i, q)$. Thus, we
define Discounted Cumulated Utility (DCU) as:
$$\mathrm{DCU}(n) = \sum_{i=1}^{n} \frac{U(p_i, q)}{\log_b(i + b - 1)} \tag{9.5}$$
which is normalized with respect to the ideal DCU to get the Normalized
Discounted Cumulated Utility (NDCU). $U(p_i, q)$ is calculated as:
$$U(p_i, q) = \sum_{j \in C(p_i)} w_j \tag{9.6}$$
where $C(p_i)$ is the set of nuggets contained in passage $p_i$, determined using
the rules described in Section 9.4.1.2. Each nugget $N_j$ has an associated weight
$w_j$, which determines the utility derived from seeing that nugget in a
system-produced passage. These weights are initially set to be equal, but could also
 