Information Technology Reference
In-Depth Information
Table 4. Retrieval effectiveness of the different
approaches
retrieval effectiveness
All the retrieval experiments have been carried
out using the same retrieval engine, which is based
on the Vector Space Model and implements the
tf idf weighting scheme described previously.
The results in terms of retrieval effectiveness are
presented in Table 4, where the average precision
(Av.Prec.), the percentage of queries that gave the
relevant document within the first k positions ( k
with values in [1,3,5,10]), and the ones that did not
retrieve the relevant document at all (“not found”),
are reported. It has to be noted that retrieval
effectiveness is usually reported through preci-
sion/recall plots, using the rank list of retrieved
documents. The particular choice of parameters
adopted in this experiment depends on the fact
that there was only one relevant document for
each query in the test collection.
FL and DD had comparable results, because
they have close average precision and both ap-
proaches always retrieved the relevant document
within the first ten positions (DD within the first
three, but with a lower percentage of queries that
retrieved the relevant document at top rank). Also
PB and MO had comparable results in terms of
average precision; with slightly better perfor-
mances of MO, in particular because PB did
FL
DD
PB
MO
Av.Prec.
0.98
0.96
0.80
0.83
= 1
97.5%
92.5%
72.5%
77.5%
≤ 3
97.5%
100%
87.5%
87.5%
≤ 5
97.5%
100%
87.5%
92.5%
≤ 10
100%
100%
90.0%
95.0%
not found
0.0%
0.0%
10.0%
2.5%
which for many subjects gave a minimum length
of lexical units of about four to five notes. Another
interesting feature is the average number of differ-
ent lexical units for documents that range from 43.2
for PB to 61.9 for DD. Given that these values are
computed for complete music documents, even if
only on the melodic line, music indexing is based
on a very compact description of document con-
tents, at least compared with indexing of textual
documents that, also in the case of short docu-
ments, have hundreds of different index terms.
The last row reports the number of different lexical
units that corresponds to the number of entries
in the index file. As it can be seen, segmentation
with overlapping units with different lengths (DD)
has the drawback of an increase of the index size
and in memory requirements.
Figure 6. Retrieval effectiveness of the different approaches depending on the number of errors added
to the query (left) and on the shortening of query length (right)
1
1
0.9
0.9
0.8
0.8
0.7
0.7
0.6
0.6
0.5
FL
DD
PB
MO
FUS
0.5
FL
DD
PB
MO
FUS
0.4
0.4
0.3
0.3
100%
90%
80%
70%
60%
50%
correct
1 error
2 errors
3 errors
 
Search WWH ::




Custom Search