Table 4. Linear regression of different metric values in TREC 9 (dependent variable is ED(C))

Metric   Constant   Linear coefficient   R²      Significance level
AP       9.013      -.675                0.493   .000
RP       9.013      -.548                0.439   .000
NDCG     8.972      -.399                0.664   .000
P10      9.021      -.641                0.389   .000

Table 5. Linear regression of different metric values in TREC 2001 (dependent variable is ED(L))

Metric   Constant   Linear coefficient   R²      Significance level
AP       18.412     -2.682               0.760   .000
RP       17.749     -2.403               0.639   .000
NDCG     18.736     -1.840               0.799   .000
P10      18.243     -1.500               0.484   .000

Table 6. Linear regression of different metric values in TREC 2001 (dependent variable is ED(C))

Metric   Constant   Linear coefficient   R²      Significance level
AP       4.688      -1.227               0.906   .000
RP       4.726      -1.173               0.868   .000
NDCG     4.838      -0.845               0.962   .000
P10      4.633      -0.784               0.754   .000

used different thresholds for them so that the differentiation rate falls in the
range that we are interested in. For the Euclidean distance, we used 10 thresholds
(0.1%, 0.2%, ..., 1%), while for the ranking-based metrics, we used 10 thresholds
(6%, 9%, ..., 33%). In TREC 9, there are 53 runs and the number of all possible
pairs of runs is 1431. In TREC 2001, there are 34 runs and the number of all
possible pairs of runs is 561. Let us take TREC 2001 as an example. Assume that,
for a given threshold T and a given metric m, there are n pairs of runs, out of
all 561 possible pairs, whose performance difference is above T. In this situation,
the differentiation rate of the metric m is n/561 for the given threshold T.
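
The counting described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the experiments: the text does not say how the percentage difference between two runs is computed, so the function `relative_difference` (difference divided by the larger of the two scores) is an assumption, and the run names and scores are made up.

```python
from itertools import combinations
from math import comb


def relative_difference(a, b):
    """Relative difference between two scores.

    Assumption: the difference is taken relative to the larger score;
    the source text does not state the exact definition.
    """
    larger = max(abs(a), abs(b))
    return 0.0 if larger == 0 else abs(a - b) / larger


def differentiation_rate(scores, threshold):
    """Fraction of run pairs whose difference is above `threshold`.

    scores: dict mapping run name -> score under one metric.
    threshold: T as a fraction, e.g. 0.06 for 6%.
    """
    pairs = list(combinations(scores.values(), 2))
    if not pairs:
        raise ValueError("need at least two runs to form a pair")
    n = sum(1 for a, b in pairs if relative_difference(a, b) > threshold)
    return n / len(pairs)


# 34 runs give 34 * 33 / 2 = 561 pairs, as stated for TREC 2001.
print(comb(34, 2))

# Made-up scores for three hypothetical runs, for illustration only.
toy_scores = {"run_a": 0.30, "run_b": 0.33, "run_c": 0.20}
print(differentiation_rate(toy_scores, 0.10))
```

With real data, `scores` would hold one entry per submitted run, and the function would be called once for each of the 10 thresholds.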
Tables 7-10 show the experimental results. It is obvious that a good metric
should have high differentiation rates and low error rates at the same time,
though these two aspects are somewhat conflicting. Therefore, we define the
“overall quality” of a metric as D_rate/E_rate for any given threshold. Here
D_rate is the differentiation rate and E_rate is the error rate.
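
As a small sketch of this ratio, the function below divides the differentiation rate by the error rate. The metric names and rate values are hypothetical placeholders, not results from the tables, and returning infinity when the error rate is zero is an assumption, since the text does not cover that case.

```python
def overall_quality(d_rate, e_rate):
    """Overall quality of a metric at one threshold: D_rate / E_rate.

    Assumption: a zero error rate is reported as infinite quality;
    the source does not define this case.
    """
    if e_rate == 0:
        return float("inf")
    return d_rate / e_rate


# Hypothetical (differentiation rate, error rate) pairs, for illustration only.
candidates = {"metric_x": (0.50, 0.25), "metric_y": (0.40, 0.10)}
for name, (d_rate, e_rate) in candidates.items():
    print(name, overall_quality(d_rate, e_rate))
```

A higher ratio means the metric separates more pairs of runs per unit of error, so a metric with a lower differentiation rate can still score better when its error rate is much lower.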
Let us compare the two variations of the Euclidean distance and then the
four ranking-based metrics separately. In terms of differentiation rate, the
cubic model performed better than the linear model in TREC 9 but worse than
the linear model in TREC 2001; as for error rate, the cubic model performed not